(57) Abstract

A method of locating an inhibitory/instability sequence or sequences within the coding region of an mRNA and modifying
the gene encoding that mRNA to remove these inhibitory/instability sequences by making clustered nucleotide substitutions
without altering the coding capacity of the gene is disclosed. Constructs containing these mutated genes and host cells containing
these constructs are also disclosed. The method and constructs are exemplified by the mutation of a Human Immunodeficiency
Virus-1 Rev-dependent gag gene to a Rev-independent gag gene. Constructs useful in locating inhibitory/instability sequences
within either the coding region or the 3' untranslated region of an mRNA are also disclosed. The exemplified constructs of the
invention may also be useful in HIV-1 immunotherapy and immunoprophylaxis.

METHOD OF ELIMINATING
INHIBITORY/INSTABILITY REGIONS OF mRNA

This application is a continuation-in-part of
U.S. Serial No. 07/858,747, filed March 27, 1992.

I. TECHNICAL FIELD

The invention relates to methods of increasing
the stability and/or utilization of a mRNA produced by a
gene by mutating regulatory or inhibitory/instability
sequences (INS) in the coding region of the gene which
prevent or reduce expression. The invention also relates
to constructs, including expression vectors, containing
genes mutated in accordance with these methods and host
cells containing these constructs.

The methods of the invention are particularly
useful for increasing the stability and/or utilization of
a mRNA without changing its protein coding capacity.
These methods are useful for allowing or increasing the
expression of genes which would otherwise not be expressed
or which would be poorly expressed because of the presence
of INS regions in the mRNA transcript. Thus, the methods,
constructs and host cells of the invention are useful for
increasing the amount of protein produced by any gene
which encodes an mRNA transcript which contains an INS.

The methods, constructs and host cells of the
invention are useful for increasing the amount of protein
produced from genes such as those coding for growth
factors, interferons, interleukins, the fos proto-oncogene
protein, and HIV-1 gag and env, for example.

The invention also relates to using the
constructs of the invention in immunotherapy and
immunoprophylaxis, e.g., as a vaccine, or in genetic
therapy after expression in humans. Such constructs can
include or be incorporated into retroviral or other
expression vectors or they may also be directly injected
into tissue cells resulting in efficient expression of the
encoded protein or protein fragment. These constructs may
also be used for in-vivo or in-vitro gene replacement,
e.g., by homologous recombination with a target gene
in-situ.

The invention also relates to certain
exemplified constructs which can be used to simply and
rapidly detect and/or define the boundaries of
inhibitory/instability sequences in any mRNA, methods of
using these constructs, and host cells containing these
constructs. Once the INS regions of the mRNAs have been
located and/or further defined, the nucleotide sequences
encoding these INS regions can be mutated in accordance
with the method of this invention to allow the increase in
stability and/or utilization of the mRNA and, therefore,
allow an increase in the amount of protein produced from
expression vectors encoding the mutated mRNA.






II. BACKGROUND ART 

While much work has been devoted to studying
transcriptional regulatory mechanisms, it has become
increasingly clear that post-transcriptional processes
also modulate the amount and utilization of RNA produced
from a given gene. These post-transcriptional processes
include nuclear post-transcriptional processes (e.g.,
splicing, polyadenylation, and transport) as well as
cytoplasmic RNA degradation. All these processes
contribute to the final steady-state level of a particular
transcript. These points of regulation create a more
flexible regulatory system than any one process could
produce alone. For example, a short-lived message is less
abundant than a stable one, even if it is highly
transcribed and efficiently processed. The efficient rate
of synthesis ensures that the message reaches the
cytoplasm and is translated, but the rapid rate of
degradation guarantees that the mRNA does not accumulate
to too high a level. Many RNAs, for example the mRNAs for
proto-oncogenes c-myc and c-fos, have been studied which
exhibit this kind of regulation in that they are expressed
at very low levels, decay rapidly and are modulated
quickly and transiently under different conditions. See
M. Hentze, Biochim. Biophys. Acta 1090:281-292 (1991) for
a review. The rate of degradation of many of these mRNAs
has been shown to be a function of the presence of one or
more instability/inhibitory sequences within the mRNA
itself.

Some cellular genes which encode unstable or
short-lived mRNAs have been shown to contain A and U-rich
(AU-rich) INS within the 3' untranslated region (3' UTR)
of the transcript mRNA. These cellular genes include the
genes encoding granulocyte-monocyte colony stimulating
factor (GM-CSF), whose AU-rich 3' UTR sequences (containing
8 copies of the sequence motif AUUUA) are more highly
conserved between mice and humans than the protein
encoding sequences themselves (93% versus 65%) (G. Shaw
and R. Kamen, Cell 46:659-667 (1986)) and the myc
proto-oncogene (c-myc), whose untranslated regions are conserved
throughout evolution (for example, 81% for man and mouse)
(M. Cole and S.E. Mango, Enzyme 44:167-180 (1990)). Other
unstable or short-lived mRNAs which have been shown to
contain AU-rich sequences within the 3' UTR include
interferons (alpha, beta and gamma IFNs); interleukins
(IL1, IL2 and IL3); tumor necrosis factor (TNF);
lymphotoxin (Lym); IgG1 induction factor (IgG IF);
granulocyte colony stimulating factor (G-CSF); myb
proto-oncogene (c-myb); and sis proto-oncogene (c-sis) (G. Shaw
and R. Kamen, Cell 46:659-667 (1986)). See also R.
Wisdom and W. Lee, Gen. & Devel. 5:232-243 (1991) (c-myc);
A. Shyu et al., Gen. & Devel. 5:221-231 (1991) (c-fos); T.
Wilson and R. Treisman, Nature 336:396-399 (1988) (c-fos);
T. Jones and M. Cole, Mol. Cell Biol. 7:4513-4521 (1987)
(c-myc); V. Kruys et al., Proc. Natl. Acad. Sci. USA
89:673-677 (1992) (TNF); D. Koeller et al., Proc. Natl.
Acad. Sci. USA 88:7778-7782 (1991) (transferrin receptor
(TfR) and c-fos); I. Laird-Offringa et al., Nucleic Acids
Res. 19:2387-2394 (1991) (c-myc); D. Wreschner and G.
Rechavi, Eur. J. Biochem. 172:333-340 (1988) (which
contains a survey of genes and relative stabilities);
Bunnell et al., Somatic Cell and Mol. Genet. 16:151-162
(1990) (galactosyltransferase-associated protein (GTA),
which contains an AU-rich 3' UTR with regions that are 98%
similar among humans, mice and rats); and Caput et al.,
Proc. Natl. Acad. Sci. 83:1670-1674 (1986) (TNF, which
contains a 33 nt AU-rich sequence conserved in toto in the
murine and human TNF mRNAs).

Some of these cellular genes which have been
shown to contain INS within the 3' UTR of their mRNA have
also been shown to contain INS within the coding region.
See, e.g., R. Wisdom and W. Lee, Gen. & Devel. 5:232-243
(1991) (c-myc); A. Shyu et al., Gen. & Devel. 5:221-231
(1991) (c-fos).

Like the cellular mRNAs, a number of HIV-1 mRNAs
have also been shown to contain INS within the protein
coding regions, which in some cases coincide with areas of
high AU content. For example, a 218 nucleotide region
with high AU content (61.5%) present in the HIV-1 gag
coding sequence and located at the 5' end of the gag gene
has been implicated in the inhibition of gag expression.
S. Schwartz et al., J. Virol. 66:150-159 (1992). Further
experiments have indicated the presence of more than one
INS in the gag-protease gene region of the viral genome
(see below). Regions of high AU content have been found
in the HIV-1 gag/pol and env INS regions. The AUUUA
sequence is not present in the gag coding sequence, but it
is present in many copies within gag/pol and env coding
regions. S. Schwartz et al., J. Virol. 66:150-159 (1992).
See also, e.g., M. Emerman, Cell 57:1155-1165 (1989) (env
gene contains both 3' UTR and internal
inhibitory/instability sequences); C. Rosen, Proc. Natl.
Acad. Sci. USA 85:2071-2075 (1988) (env); M.
Hadzopoulou-Cladaras et al., J. Virol. 63:1265-1274 (1989)
(env); F. Maldarelli et al., J. Virol. 65:5732-5743 (1991)
(gag/pol); A. Cochrane et al., J. Virol. 65:5303-5313
(1991) (pol). F. Maldarelli et al., supra, note that the
direct analysis of the function of INS regions in the
context of a replication-competent, full-length HIV-1
provirus is complicated by the fact that the intragenic
INS are located in the coding sequences of virion
structural proteins. They further note that changes in
these intragenic INS sequences would in most cases affect
protein sequences as well, which in turn could affect the
replication of such mutants.
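
The two sequence properties referred to above, AU content and the number of AUUUA motifs, can be computed directly from a nucleotide sequence. The following is a minimal sketch in Python; the function names `au_content` and `count_auuua` and the example sequence are illustrative assumptions and are not taken from the cited references.

```python
import re


def au_content(seq: str) -> float:
    """Return the percentage of A and U (or T) residues in a sequence."""
    s = seq.upper().replace("T", "U")  # treat DNA and RNA spellings alike
    if not s:
        raise ValueError("empty sequence")
    return 100.0 * (s.count("A") + s.count("U")) / len(s)


def count_auuua(seq: str) -> int:
    """Count AUUUA pentanucleotides, including copies that share a terminal A."""
    s = seq.upper().replace("T", "U")
    # A zero-width lookahead lets overlapping motifs (AUUUAUUUA) be counted twice.
    return len(re.findall(r"(?=AUUUA)", s))


if __name__ == "__main__":
    example = "AUUUAUUUAGCGC"  # made-up sequence, for illustration only
    print(f"AU content: {au_content(example):.1f}%")
    print(f"AUUUA motifs: {count_auuua(example)}")
```

A high AU content or the presence of AUUUA motifs is only suggestive; as discussed in this section, some INS are not AU-rich and some stable transcripts carry the AUUUA motif.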

The INS regions are not necessarily AU-rich.
For example, the c-fos coding region INS is structurally
unrelated to the AU-rich 3' UTR INS (A. Shyu et al., Gen.
& Devel. 5:221-231 (1991)), and some parts of the env
coding region, which appear to contain INS elements, are
not AU-rich. Furthermore, some stable transcripts also
carry the AUUUA motif in their 3' UTRs, implying either
that this sequence alone is not sufficient to destabilize
a transcript, or that these messages also contain a
dominant stabilizing element (M. Cole and S.E. Mango,
Enzyme 44:167-180 (1990)). Interestingly, elements unique
to specific mRNAs have also been found which can stabilize
a mRNA transcript. One example is the Rev responsive
element, which in the presence of Rev protein promotes the
transport, stability and utilization of a mRNA transcript
(B. Felber et al., Proc. Natl. Acad. Sci. USA 86:1495-1499
(1989)).

It is not yet known whether the AU sequences
themselves, and specifically the Shaw-Kamen sequence,
AUUUA, act as part or all of the degradation signal. Nor
is it clear whether this is the only mechanism employed
for short-lived messages, or if there are different
classes of RNAs, each with its own degradative system.
See M. Cole and S.E. Mango, Enzyme 44:167-180 (1990) for
a review; see also T. Jones and M. Cole, Mol. Cell.
Biol. 7:4513-4521 (1987). Mutation of the only copy of
the AUUUA sequence in the c-myc RNA INS region has no
effect on RNA turnover; therefore the inhibitory sequence
may be quite different from that of GM-CSF (M. Cole and
S.E. Mango, Enzyme 44:167-180 (1990)), or else the mRNA
instability may be due to the presence of additional INS
regions within the mRNA.

Previous workers have made mutations in genes
encoding AU-rich inhibitory/instability sequences within
the 3' UTR of their transcript mRNAs. For example, G.
Shaw and R. Kamen, Cell 46:659-667 (1986), introduced a 51
nucleotide AT-rich sequence from GM-CSF into the 3' UTR of
the rabbit β-globin gene. This insertion caused the
otherwise stable β-globin mRNA to become highly unstable
in vivo, resulting in a dramatic decrease in expression of
β-globin as compared to the wild-type control. The
introduction of another sequence of the same length, but
with 14 G's and C's interspersed among the sequence, into
the same site of the 3' UTR of the rabbit β-globin gene
resulted in accumulation levels which were similar to that
of wild-type β-globin mRNA. This control sequence did not
contain the motif AUUUA, which occurs seven times in the
AU-rich sequence. The results suggested that the presence
of the AU-rich sequence in the β-globin mRNA specifically
confers instability.

A. Shyu et al., Gen. & Devel. 5:221-231 (1991),
studied the AU-rich INS in the 3' UTR of c-fos by
disrupting all three AUUUA pentanucleotides by single
U-to-A point mutations to preserve the AU-richness of the
element while altering its sequence. This change in the
sequence of the 3' UTR INS dramatically inhibited the
ability of the mutated 3' UTR to destabilize the β-globin






WO 93/20212 



PCT/US93/02908 






message when inserted into the 3' UTR of a β-globin mRNA
as compared to the wild-type INS. The c-fos
protein-coding region INS (which is structurally unrelated to the
3' UTR INS) was studied by inserting it in-frame into the
coding region of a β-globin gene and observing the effect of
deletions on the stability of the heterologous
c-fos-β-globin mRNA.

Previous workers have also made mutations in
genes encoding inhibitory/instability sequences within the
coding region of their transcript mRNAs. For example, P.
Carter-Muenchau and R. Wolf, Proc. Natl. Acad. Sci. USA
86:1138-1142 (1989), demonstrated the presence of a
negative control region that lies deep in the coding
sequence of the E. coli 6-phosphogluconate dehydrogenase
(gnd) gene. The boundaries of the element were defined by
the cloning of a synthetic "internal complementary
sequence" (ICS) and observing the effect of this internal
complementary element on gene expression when placed at
several sites within the gnd gene. The effect of single
and double mutations introduced into the synthetic ICS
element by site-directed mutagenesis on regulation of
expression of a gnd-lacZ fusion gene correlated with the
ability of the respective mRNAs to fold into secondary
structures that sequester the ribosome binding site.
Thus, the gnd gene's internal regulatory element appears
to function as a cis-acting antisense RNA.

M. Lundrigan et al., Proc. Natl. Acad. Sci. USA
88:1479-1483 (1991), conducted an experiment to identify
sequences linked to btuB that are important for its proper
expression and transcriptional regulation, in which a DNA
fragment carrying the region from -60 to +253 (the coding
region starts at +241) was mutagenized and then fused in
frame to lacZ. Expression of β-galactosidase from variant
plasmids containing a single base change was then
analyzed. The mutations were all GC to AT transitions,
as expected from the mutagenesis procedures used. Among
other mutations, a single base substitution at +253
resulted in greatly increased expression of the btuB-lacZ
gene fusion under both repressing and nonrepressing
conditions.

R. Wisdom and W. Lee, Gen. & Devel. 5:232-243
(1991), conducted an experiment which showed that mRNA
derived from a hybrid full length c-myc gene, which
contains a mutation in the translation initiation codon
from ATG to ATC, is relatively stable, implying that the
c-myc coding region inhibitory sequence functions in a
translation dependent manner.

R. Parker and A. Jacobson, Proc. Natl. Acad.
Sci. USA 87:2780-2784 (1990), demonstrated that a region of
42 nucleotides found in the coding region of Saccharomyces
cerevisiae MATα1 mRNA, which normally confers low
stability, can be experimentally inactivated by
introduction of a translation stop codon immediately
upstream of this 42 nucleotide segment. The experiments
suggest that the decay of MATα1 mRNA is promoted by the
translocation of ribosomes through a specific region of
the coding sequence. This 42 nucleotide segment has a
high content (8 out of 14) of rare codons (where a rare
codon is defined by its occurrence fewer than 13 times per
1000 yeast codons (citing S. Aota et al., Nucl. Acids
Res. 16:r315-r402 (1988))) that may induce slowing of
translation elongation. The authors of the study, R.
Parker and A. Jacobson, state that the concentration of
rare codons in the sequences required for rapid decay,
coupled with the prevalence of rare codons in unstable
yeast mRNAs and the known ability of rare codons to induce
translational pausing, suggests a model in which mRNA
structural changes may be affected by the particular
positioning of a paused ribosome. Another author stated
that it would be revealing to find out whether (and how) a
kinetic change in translation elongation could affect mRNA
stability (M. Hentze, Biochim. Biophys. Acta 1090:281-292
(1991)). R. Parker and A. Jacobson note, however, that
the stable PGK1 mRNA can be altered to include up to 40%
rare codons with, at most, a 3-fold effect on steady-state
mRNA level and that this difference may actually be due to
a change in transcription rates. Thus, these authors
conclude, it seems unlikely that ribosome pausing per se
is sufficient to promote rapid mRNA decay.

None of the aforementioned references describe
or suggest the present invention of locating
inhibitory/instability sequences within the coding region
of an mRNA and modifying the gene encoding that mRNA to
remove these inhibitory/instability sequences by making
multiple nucleotide substitutions without altering the
coding capacity of the gene.

III. DISCLOSURE OF THE INVENTION

The invention relates to methods of increasing
the stability and/or utilization of a mRNA produced by a
gene by mutating regulatory or inhibitory/instability
sequences (INS) in the coding region of the gene which
prevent or reduce expression. The invention also relates
to constructs, including expression vectors, containing
genes mutated in accordance with these methods and host
cells containing these constructs.

As defined herein, an inhibitory/instability
sequence of a transcript is a regulatory sequence that
resides within an mRNA transcript and is either (1)
responsible for rapid turnover of that mRNA and can
destabilize a second indicator/reporter mRNA when fused to
that indicator/reporter mRNA, or (2) responsible for
underutilization of a mRNA and can cause decreased protein
production from a second indicator/reporter mRNA when
fused to that second indicator/reporter mRNA, or (3) both
of the above. The inhibitory/instability sequence of a
gene is the gene sequence that encodes an
inhibitory/instability sequence of a transcript. As used
herein, utilization refers to the overall efficiency of
translation of an mRNA.

The methods of the invention are particularly
useful for increasing the stability and/or utilization of
a mRNA without changing its protein coding capacity.
However, alternative embodiments of the invention in which
the inhibitory/instability sequence is mutated in such a
way that the amino acid sequence of the encoded protein is
changed to include conservative or non-conservative amino
acid substitutions, while still retaining the function of
the originally encoded protein, are also envisioned as
part of the invention.

These methods are useful for allowing or
increasing the expression of genes which would otherwise
not be expressed or which would be poorly expressed
because of the presence of INS regions in the mRNA
transcript. The invention provides methods of increasing
the production of a protein encoded by a gene which
encodes an mRNA containing an inhibitory/instability
region by altering the portion of the nucleotide sequence
of any gene encoding the inhibitory/instability region.

The methods, constructs and host cells of the
invention are useful for increasing the amount of protein
produced by any gene which encodes an mRNA transcript
which contains an INS. Examples of such genes include,
for example, those coding for growth factors, interferons,
interleukins, and the fos proto-oncogene protein, as well
as the genes coding for HIV-1 gag and env proteins.

The method of the invention is exemplified by
the mutational inactivation of an INS within the coding
region of the HIV-1 gag gene which results in increased
gag expression, and by constructs useful for
Rev-independent gag expression in human cells. This
mutational inactivation of the inhibitory/instability
sequences involves introducing multiple point mutations
into the AU-rich inhibitory sequences within the coding
region of the gag gene which, due to the degeneracy of
nucleotide coding sequences, do not affect the amino acid
sequence of the gag protein.
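
The requirement that point mutations leave the encoded amino acid sequence unchanged can be checked mechanically by translating the sequence before and after mutation. The following is a minimal sketch in Python using the standard genetic code; the function names (`translate`, `is_silent`, `at_fraction`) and the short example sequences are hypothetical and are not sequences from the gag gene.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence into one-letter amino acids."""
    dna = dna.upper().replace("U", "T")
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent(original: str, mutated: str) -> bool:
    """True if the mutated sequence encodes the same protein as the original."""
    return len(original) == len(mutated) and translate(original) == translate(mutated)


def at_fraction(dna: str) -> float:
    """Fraction of A and T (or U) residues in the sequence."""
    dna = dna.upper().replace("U", "T")
    return (dna.count("A") + dna.count("T")) / len(dna)


if __name__ == "__main__":
    original = "ATGAAATTTAAA"  # made-up example: Met-Lys-Phe-Lys
    mutated = "ATGAAGTTCAAG"   # synonymous codons with fewer A/T residues
    print(translate(original), translate(mutated), is_silent(original, mutated))
    print(at_fraction(original), at_fraction(mutated))
```

In this made-up example the substitutions lower the A/T fraction while the translated sequence stays the same, which is the property the degeneracy of the code makes possible.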

The constructs of the invention are exemplified
by vectors containing the gag, env, and pol genes which
have been mutated in accordance with the methods of this
invention and the host cells are exemplified by human
HLtat cells containing these vectors.

The invention also relates to using the 
constructs of the invention in immunotherapy and 
immunoprophylaxis, e.g., as a vaccine, or in genetic 
therapy after expression in humans. Such constructs can 
include or be incorporated into retroviral vectors or 
other expression vectors or they may also be directly 
injected into tissue cells resulting in efficient 
expression of the encoded protein or protein fragment. 
These constructs may also be used for in-vivo or in-vitro 
gene replacement, e.g., by homologous recombination with a 
target gene in-situ.

The invention also relates to certain 
exemplified constructs which can be used to simply and 
rapidly detect and/or further define the boundaries of 
inhibitory/instability sequences in any mRNA which is
known or suspected to contain such regions, whether the 
INS are within the coding region or in the 3'UTR or both. 
Once the INS regions of the genes have been located and/or 
further defined through the use of these vectors, the same 
vectors can be used in mutagenesis experiments to 
eliminate the identified INS without affecting the coding 
capacity of the gene, thereby allowing an increase in the 
amount of protein produced from expression vectors 
containing these mutated genes. The invention also 
relates to methods of using these constructs and to host 
cells containing these constructs. 

The constructs of the invention which can be
used to detect instability/inhibitory regions within an
mRNA are exemplified by the vectors p19, p17M1234,
p37M1234 and p37M1-10D, which are set forth in Fig. 1. (B)
and Fig. 6. p37M1234 and p37M1-10D are the preferred
constructs, due to the existence of a commercially
available ELISA test which allows the simple and rapid
detection of any changes in the amount of expression of
the gag indicator/reporter protein. However, any
constructs which contain the elements depicted between the
long terminal repeats in the aforementioned constructs of
Fig. 1. (B) and Fig. 6, and which can be used to detect
instability/inhibitory regions within a mRNA, are also
envisioned as part of this invention.

The existence of inhibitory/instability
sequences has been known in the art, but no solution to 
the problem which allowed increased expression of the 
genes encoding the mRNAs containing these sequences within 
coding regions by making multiple nucleotide 
substitutions, without altering the coding capacity of the 
gene, has heretofore been disclosed. 



IV. BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1. (A) Structure of the HIV-1 genome. Boxes indicate
the different viral genes. (B) Structure of the gag
expression plasmids (see infra). Plasmid p17 contains the
complete HIV-1 5' LTR and sequences up to the BssHII
restriction site at nucleotide (nt) 257. (The nucleotide
numbering refers to the revised nucleotide sequence of the
HIV-1 molecular clone pHXB2 (G. Myers et al., Eds., Human
retroviruses and AIDS. A compilation and analysis of
nucleic acid and amino acid sequences (Los Alamos National
Laboratory, Los Alamos, New Mexico, 1991), incorporated
herein by reference).) This sequence is followed by the
p17gag coding sequence spanning nt 336-731 (represented as
an open box) immediately followed by a translational stop
codon and a linker sequence. Adjacent to the linker is
the HIV-1 3' LTR from nt 8561 to the last nucleotide of
the U5 region. Plasmid p17R contains in addition the 330
nt StyI fragment encompassing the RRE (L. Solomin et al.,
J. Virol. 64:6010-6017 (1990)) (represented as a stippled
box) 3' to the p17gag coding sequence. The RRE is followed
by HIV-1 sequences from nt 8021 to the last nucleotide of
the U5 region of the 3' LTR. Plasmids p19 and p19R were
generated by replacing the HIV-1 p17gag coding sequence in
plasmids p17 and p17R, respectively, with the RSV p19gag
coding sequence (represented as a black box). Plasmid
p17M1234 is identical to p17, except for the presence of
28 silent nucleotide substitutions within the gag coding
region, indicated by XXX. Wavy lines represent plasmid
sequences. Plasmid p17M1234(731-1424) and plasmid
p37M1234 are described immediately below and in the
description. These vectors are illustrative of constructs
which can be used to determine whether a particular
nucleotide sequence encodes an INS. In this instance,
vector p17M1234, which contains an indicator gene (here,
p17gag), represents the control vector and vectors
p17M1234(731-1424) and p37M1234 represent vectors in which
the nucleotide sequence of interest (here the p24gag coding
region) is inserted into the vector either 3' to the stop
codon of the indicator gene or is fused in frame to the
coding region of the indicator gene, respectively. (C)
Construction of expression vectors for identification of
gag INS and for further mutagenesis. p17M1234 was used as
a vector to insert additional HIV-1 gag sequences
downstream from the coding region of the altered p17gag
gene. Three different fragments indicated by nucleotide
numbers were inserted into vector p17M1234 as described
below. To generate plasmids p17M1234(731-1081),
p17M1234(731-1424) and p17M1234(731-2165), the indicated
fragments were inserted 3' to the stop codon of the p17gag
coding sequence in p17M1234. In expression assays (data
not shown), p17M1234(731-1081) and p17M1234(731-1424)
expressed high levels of p17gag protein. In contrast,
p17M1234(731-2165) did not express p17gag protein,
indicating the presence of additional INS within the HIV-1
gag coding region. To generate plasmids
p17M1234(731-1081)NS, p37M1234 and p55M1234, the stop codon at the end
of the altered p17gag gene and all linker sequences in
p17M1234 were eliminated by oligonucleotide-directed
mutagenesis and the resulting plasmids restored the gag
open reading frame as in HIV-1. In expression assays
(data not shown) p37M1234 expressed high levels of protein
as determined by western blotting and ELISA assays whereas
p55M1234 did not express any detectable gag protein.
Thus, the addition of sequences 3' to the p24 region
resulted in the elimination of protein expression,
indicating that nucleotide sequence 1424-2165 contains an
INS. This experiment demonstrated that p37M1234 is an
appropriate vector to analyze additional INS.
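
The mapping logic used in part (C) can be written out as a small calculation: nested fragments sharing one start are tested in order of increasing length, and the first fragment that abolishes expression places an INS between the end of the longest expressing fragment and its own end. The following Python sketch is illustrative only; the function name `locate_ins` and the reduction of expression to a yes/no value are assumptions made for the example, and the sketch ignores the possibility of several weaker INS acting together.

```python
from typing import List, Optional, Tuple

# (start nt, end nt, expressed?) for one tested insert
Result = Tuple[int, int, bool]


def locate_ins(results: List[Result]) -> Optional[Tuple[int, int]]:
    """Return the nucleotide interval inferred to contain an INS, or None.

    Assumes all fragments share the same start and differ only in their end.
    """
    ordered = sorted(results, key=lambda r: r[1])  # shortest insert first
    last_expressing_end = ordered[0][0]            # nothing cleared yet
    for _start, end, expressed in ordered:
        if not expressed:
            return (last_expressing_end, end)
        last_expressing_end = end
    return None  # every insert still allowed expression


if __name__ == "__main__":
    # Outcomes reported above for the p17M1234 inserts.
    assays = [
        (731, 1081, True),
        (731, 1424, True),
        (731, 2165, False),
    ]
    print(locate_ins(assays))
```

With the three inserts described above, the inferred interval is nt 1424-2165, in agreement with the conclusion stated in the legend.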

Fig. 2. Gag expression from the different vectors. (A)
HLtat cells were transfected with plasmid p17, p17R, or
p17M1234 in the absence (-) or presence (+) of Rev (see
infra). The transfected cells were analyzed by
immunoblotting using a human HIV-1 patient serum. (B)
Plasmid p19 or p19R was transfected into HLtat cells in
the absence (-) or presence (+) of Rev. The transfected
cells were analyzed by immunoblotting using rabbit
anti-RSV p19gag serum. HIV or RSV proteins served as
markers in the same gels. The positions of p17gag and p19gag
are indicated at right.

Fig. 3. mRNA analysis on northern blots. (A) HLtat cells
were transfected with the indicated plasmids in the
absence (-) or presence (+) of Rev. 20 μg of total RNA
prepared from the transfected cells were analyzed (see
infra). (B) RNA production from plasmid p19 or p19R was
similarly analyzed in the absence (-) or presence (+) of
Rev.

Fig. 4. Nucleotide sequence of the HIV-1 p17gag region.
The locations of the 4 oligonucleotides (M1-M4) used to
generate all mutants are underlined. The silent
nucleotide substitutions introduced by each mutagenesis
oligonucleotide are indicated below the coding sequence.
Numbering starts from nt +1 of the viral mRNA.

Fig. 5. Gag expression by different mutants. HLtat cells
were transfected with the various plasmids indicated at
the top of the figure. Plasmid p17R was transfected in
the absence (-) or presence (+) of Rev, while the other
plasmids were analyzed in the absence of Rev. p17gag
production was assayed by immunoblotting as described in
Fig. 2.

Fig. 6. Expression vectors used in the identification
and elimination of additional INS elements in the gag
region. The gag and pol region nucleotides included in
each vector are indicated by lines. The position of some
gag and pol oligonucleotides is indicated at the top of
the figure, as are the coding regions for p17gag, p24gag,
p15gag, protease and p66pol proteins. Vector p37M1234 was
further mutagenized using different combinations of
oligonucleotides. One obtained mutant gave high levels of
p24 after expression. It was analyzed by sequencing and
found to contain four mutant oligonucleotides M6gag,
M7gag, M8gag and M10gag. Other mutants containing
different combinations of oligos did not show an increase
in expression, or only partial increase in expression.
p55BM1-10 and p55AM1-10 were derived from p37M1-10D.
p55M1-13P0 contains additional mutations in the gag and
pol regions included in the oligonucleotides M11gag,
M12gag, M13gag and M0pol. The hatched boxes indicate the
location of the mutant oligonucleotides; the hatched boxes
containing circles indicate mutated regions containing
ATTTA sequences, which may contribute to instability
and/or inhibition of the mRNA; and the open boxes
containing triangles indicate mutated regions containing
AATAAA sequences, which may contribute to instability
and/or inhibition of the mRNA. Typical levels of p24gag
expression in human cells after transfections as described
supra are shown at the right (in pg/ml).

Fig. 7. Eukaryotic expression plasmids used to study env
expression. The different expression plasmids are derived
from pNL15E (Schwartz et al., J. Virol. 64:5448-5456
(1990)). The generation of the different constructs is
described in the text. The numbering follows the
corrected HXB2 sequence (Myers et al., 1991, supra; Ratner
et al., Hamatol. Bluttransfus. 31:404-406 (1987); Ratner
et al., AIDS Res. Hum. Retroviruses 3:57-69 (1987);
Solomin et al., J. Virol. 64:6010-6017 (1990)), starting
with the first nucleotide of R as +1. 5'SS, 5' splice
site; 3'SS, 3' splice site.

Fig. 8. Env expression is Rev dependent in the absence of
functional splice sites. Plasmids p15ESD- and p15EDSS (C)
were transfected in the absence or presence of a rev
expression plasmid (pL3crev) into HLtat cells. One day
later, the cells were harvested for analyses of RNA and
protein. Total RNA was extracted and analyzed on Northern
blots (B). The blots were hybridized with a
nick-translated probe spanning XhoI-SacI (nt 8443 to 9118)
of HXB2. Protein production was measured by western blots
to detect cell-associated Env using a mixture of HIV-1
patient sera and rabbit anti-gp120 antibody (A).

Fig. 9. Env production from the gp120 expression plasmids.
The indicated plasmids were transfected into HLtat cells
in duplicate plates. A rev expression plasmid (pL3crev)
was cotransfected as indicated. One day later, the cells
were harvested for analyses of RNA and protein. Total RNA
was extracted and analyzed on Northern blots (A). The
blots were hybridized using a nick-translated probe
spanning nt 6158 to 7924. Protein production (B) was
measured by immunoprecipitation after labeling for 5 h
with 200 mCi/ml of 35S-cysteine to detect secreted
processed Env (gp120).

Fig. 10. The identification of INS elements within gp120
and gp41 using the p19 (RSV gag) test system. Schematic
structure of exon 5E containing the env ORF. Different
fragments (A to G) of the gp41 portion and fragment H of
the vpu/gp120 portion were PCR amplified and inserted into
the unique EcoRI site located downstream of the RSV gag
gene in p19. The location of the sequences included in
the amplified fragments is indicated to the right using
the HXB2R numbering system. Fragments A and B are amplified
from pNL15E and pNL15EDSS (in which the splice acceptor
sites 7A, 7B and 7 have been deleted), respectively, using
the same oligonucleotide primers. They are 276 and 234
nucleotides long, respectively. Fragment C was amplified
from pNL15EDSS as a 323 nucleotide fragment. Fragment F
is an HpaI-KpnI restriction fragment of 362 nucleotides.
Fragment E was amplified as a 668 nucleotide fragment from
pNL15EDSS; therefore, the major splice donor at nucleotide
5592 of HXB2 has been deleted. The rest of the fragments
were amplified from pNL15E as indicated in the figure.
HLtat cells were transfected with these constructs. One
day later, the cells were harvested and p19gag production
was determined by Western blot analysis using the
anti-RSV Gag antibody. The expression of Gag from these
plasmids was compared to Gag production of p19. SA, splice
acceptor; B, BamHI; H, HpaI; X, XhoI; K, KpnI. The down
regulatory effect of INS contained within the different
fragments is indicated at right.

Fig. 11. The identification of INS elements within gp120
and gp41 using the p37M1-10D (mutant INS p37gag expression
system) test system. Schematic structure of the env ORF.
Different fragments (1 to 7) of env were PCR amplified as
indicated in the figure and inserted into the polylinker
located downstream of the p37 mutant gag gene in
p37M1-10D. Fragments 1 to 6 were amplified from the
molecular clone pLW2.4, a gift of Dr. M. Reitz, which is
very similar to HXB2R. Clone pLW2.4 was derived from an
individual infected by the same HIV-1 strain IIIB, from
which the HXB2R molecular clone has been derived.
Fragment 7 was cloned from pNL43. For consistency and
clarity, the numbering follows the HXB2R system. HLtat
cells were transfected with these constructs. One day
later, the cells were harvested and p24gag production was
determined by antigen capture assay. The expression of
Gag from these plasmids was compared to Gag production of
p37M1-10D. The down regulatory effect of each fragment is
indicated at right.

Fig. 12. Elimination of the negative effects of CRS in
the pol region. Nucleotides 3700-4194 of HIV-1 were
inserted in vector p37M1234 as indicated. This resulted
in the inhibition of gag expression. Using mutant
oligonucleotides M9pol-M12pol (P9-P12), several mutated
CRS clones were isolated and characterized. One of them,
p37M1234RCRSP10+P12p, contains the mutations indicated in
Fig. 13. This clone produced high levels of gag.
Therefore, the combination of mutations in
p37M1234RCRSP10+P12p eliminated the INS, while mutations
only in the region of P10 or of P12 did not eliminate the
INS.

Fig. 13. Point mutations eliminating the negative effects
of CRS in the pol region (nucleotides 3700-4194). The
combination of mutations able to completely inactivate the
inhibitory/instability element within the CRS region of
HIV-1 pol (nucleotides 3700-4194) is shown under the
sequence in small letters. These mutations are contained
within oligonucleotides M10pol and M12pol (see Table 2).
The M12pol oligonucleotide contains additional mutations that
were not introduced into p37M1234RCRSP10+P12p (see Fig.
12), as determined by DNA sequencing.

Fig. 14. Plasmid map and nucleotide sequence of the
efficient gag expression vector p37M1-10D. (A) Plasmid
map of vector p37M1-10D. The plasmid contains a
pBluescriptKS(-) backbone, human genomic sequences
flanking the HIV-1 sequences as found in the pNL43 genomic
clone, HIV-1 LTRs and the p37gag region (p17 and p24). The
p17 region has been mutagenized using oligonucleotides M1
to M4, and the p24 region has been mutagenized using
oligonucleotides M6, M7, M8 and M10, as described in the
text. The coding region for p37 is flanked by the 5' and
3' HIV-1 LTRs, which provide promoter and polyadenylation
signals, as indicated by the arrows. Three consecutive
arrows indicate the U5, R, and U3 regions of the LTR,
respectively. The transcribed portions of the LTRs are
shown in black. The translational stop codon inserted at
the end of the p24 coding region is indicated at position
1818. Some restriction endonuclease cleavage sites are
also indicated. (B-D) Complete nucleotide sequence of
p37M1-10D. The amino acid sequence of the p37gag protein
is shown under the coding region. Symbols are as above.
Numbering starts at the first nucleotide of the 5' LTR.

V. MODES FOR CARRYING OUT THE INVENTION

It is to be understood that both the foregoing 
general description and the following detailed description 
are exemplary and explanatory only, and are not 
restrictive of the invention, as claimed. The 
accompanying drawings, which are incorporated in and 
constitute a part of the specification, illustrate an
embodiment of the invention and, together with the
description, serve to explain the principles of the
invention.

The invention comprises methods for eliminating
intragenic inhibitory/instability regions of an mRNA by
(a) identifying the intragenic inhibitory/instability
regions, and (b) mutating the intragenic
inhibitory/instability regions by making multiple point
mutations. These mutations may be clustered. This method
does not require the identification of the exact location
or knowledge of the mechanism of function of the INS.
Nonetheless, the results set forth herein allow the
conclusion that multiple regions within mRNAs participate
in determining stability and utilization and that many of
these elements act at the level of RNA transport,
turnover, and/or localization. Generally, the mutations
are such that the amino acid sequence encoded by the mRNA
is unchanged, although conservative and non-conservative
amino acid substitutions are also envisioned as part of
the invention where the protein encoded by the mutated
gene is substantially similar to the protein encoded by
the non-mutated gene.

The nucleotides to be altered can be chosen 
randomly, the only requirement being that the amino acid 
sequence encoded by the protein remain unchanged; or, if 
conservative and non-conservative amino acid substitutions
are to be made, the only requirement is that the protein 
encoded by the mutated gene be substantially similar to 
the protein encoded by the non-mutated gene. 

If the INS region is AT-rich or GC-rich, it is
preferable that it be altered so that it has a content of
about 50% G and C and about 50% A and T. If the INS
region contains less-preferred codons, it is preferable
that those be altered to more-preferred codons. If
desired, however (e.g., to make an A and T rich region
more G and C rich), more-preferred codons can be altered
to less-preferred codons. If the INS region contains
conserved nucleotides, some of those conserved nucleotides
could be altered to non-conserved nucleotides. Again, the
only requirement is that the amino acid sequence encoded
by the protein remain unchanged; or, if conservative and
non-conservative amino acid substitutions are to be made,
the only requirement is that the protein encoded by the
mutated gene be substantially similar to the protein
encoded by the non-mutated gene.

As used herein, conserved nucleotides means 
evolutionarily conserved nucleotides for a given gene, 
since this conservation may reflect the fact that they are 
part of a signal involved in the inhibitory/instability
determination. Conserved nucleotides can generally be 
determined from published references about the gene of 
interest or can be determined by using a variety of 
computer programs available to practitioners of the art. 

Less-preferred and more-preferred codons for
various organisms can be determined from codon usage
charts, such as those set forth in T. Maruyama et al.,
Nucl. Acids Res. 14:r151-r197 (1986) and in S. Aota et
al., Nucl. Acids Res. 16:r315-r402 (1988), or through use
of a computer program, such as that disclosed in U.S. 
Patent No. 5,082,767 entitled "Codon Pair Utilization", 
issued to G. W. Hatfield et al. on January 21, 1992, which 
is incorporated herein by reference. 

Generally, the method of the invention is 
carried out as follows: 

1. Identification of an mRNA containing an INS
The rate at which a particular protein is made
is usually proportional to the cytoplasmic level of the
mRNA which encodes it. Thus, a candidate for an mRNA
containing an inhibitory/instability sequence is one whose
mRNA or protein is either not detectably expressed or is
expressed poorly as compared to the level of expression of
a reference mRNA or protein under the control of the same
or similar strength promoter. Differences in the steady 
state levels of a particular mRNA (as determined, for 
example, by Northern blotting) , when compared to the 
steady state level of mRNA from another gene under the 
control of the same or similar strength promoter, which 
cannot be accounted for by changes in the apparent rate of 
transcription (as determined, for example, by nuclear run- 
on assays) indicate that the gene is a candidate for an 
unstable mRNA. In addition or as an alternative to being 
unstable, cytoplasmic mRNAs may be poorly utilized due to 
various inhibitory mechanisms acting in the cytoplasm. 
These effects may be mediated by specific mRNA sequences 
which are named herein as "inhibitory sequences". 

Candidate mRNAs containing
inhibitory/instability regions include mRNAs from genes
whose expression is tightly regulated, e.g., many
oncogenes, growth factor genes and genes for biological
response modifiers such as interleukins. The mRNAs of many
of these genes are expressed at very low levels, decay
rapidly and are modulated quickly and transiently under
different conditions. The negative regulation of expression
at the level of mRNA stability and utilization has been
documented in several cases and has been proposed to
occur in many other cases. Examples of genes for
which there is evidence for post-transcriptional
regulation due to the presence of inhibitory/instability
regions in the mRNA include the cellular genes encoding
granulocyte-monocyte colony stimulating factor (GM-CSF);
proto-oncogenes c-myc, c-myb, c-sis and c-fos; interferons
(alpha, beta and gamma IFNs); interleukins (IL1, IL2 and
IL3); tumor necrosis factor (TNF); lymphotoxin (Lym); IgG1
induction factor (IgG IF); granulocyte colony stimulating
factor (G-CSF); transferrin receptor (TfR); and
galactosyltransferase-associated protein (GTA); the HIV-1
genes encoding env, gag and pol; the E. coli genes for
6-phosphogluconate dehydrogenase (gnd) and btuB; and the
yeast gene for MATα1 (see the discussion in the
"Background Art" section, above). The genes encoding the
cellular proto-oncogenes c-myc and c-fos, as well as the
yeast gene for MATα1 and the HIV-1 genes for gag, env and
pol, are genes for which there is evidence for
inhibitory/instability regions within the coding region in
addition to evidence for inhibitory/instability regions
within the non-coding region. Genes encoding or suspected
of encoding mRNAs containing inhibitory/instability
regions within the coding region are particularly relevant
to the invention.

After identifying a candidate unstable or poorly
utilized mRNA, the in vivo half-life (or stability) of
that mRNA can be studied by conducting pulse-chase
experiments (i.e., labeling newly synthesized RNAs with a
radioactive precursor and monitoring the decay of the
radiolabeled mRNA in the absence of label); or by
introducing in vitro transcribed mRNA into target cells
(either by microinjection, calcium phosphate
coprecipitation, electroporation, or other methods known in
the art) to monitor the in vivo half-life of the defined
mRNA population; or by expressing the mRNA under study
from a promoter which can be induced and which shuts off
transcription soon after induction, and estimating the
half-life of the mRNA which was synthesized during this
short transcriptional burst; or by blocking transcription
pharmacologically (e.g., with Actinomycin D) and following
the decay of the particular mRNA at various time points
after the addition of the drug by Northern blotting or RNA
protection (e.g., S1 nuclease) assays. Methods for all the
above determinations are well established. See, e.g.,
M.W. Hentze et al., Biochim. Biophys. Acta 1090:281-292
(1991) and references cited therein. See also
S. Schwartz et al., J. Virol. 66:150-159 (1992). The most
useful measurement is how much protein is produced,
because this includes all possible INS mechanisms.
Examples of various mRNAs which have been shown to contain
or which are suspected to contain INS regions are
described above. Some of these mRNAs have been shown to
have half-lives of less than 30 minutes when their mRNA
levels are measured by Northern blots (see, e.g., D.
Wreschner and G. Rechavi, Eur. J. Biochem. 172:333-340
(1988)).
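
For the transcription-block approach described above, a half-life can be estimated from the decaying signal if simple first-order decay is assumed. The following is a minimal sketch only: the function name, the time points and the normalized band intensities are hypothetical and are not taken from the specification.

```python
import math


def estimate_half_life(times_min, signals):
    """Estimate an mRNA half-life (minutes) from a decay time course.

    Assumes first-order decay, signal(t) = signal(0) * exp(-k * t), and
    fits ln(signal) against time by ordinary least squares.
    """
    if len(times_min) != len(signals) or len(times_min) < 2:
        raise ValueError("need at least two matched time points")
    logs = [math.log(s) for s in signals]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    slope = sxy / sxx  # equals -k when the signal decays
    if slope >= 0:
        raise ValueError("signal does not decay over the time course")
    return math.log(2) / -slope


# Hypothetical normalized Northern blot signals after addition of the drug.
times = [0, 15, 30, 60]
signals = [1.0, 0.5, 0.25, 0.0625]
half_life = estimate_half_life(times, signals)
```

Real time courses are noisier than this illustration, and decay that is not first order would need a different model.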

2. Localization of Instability Determinants
When an unstable or poorly utilized mRNA has
been identified, the next step is to search for the
responsible (cis-acting) RNA sequence elements. Detailed
methods for localizing the cis-acting
inhibitory/instability regions are set forth in each of
the references described in the "Background Art" section,
above, and are also discussed infra. The exemplified
constructs of the present invention can also be used to
localize INS (see below). Cis-acting sequences
responsible for specific mRNA turnover can be identified
by deletion and point mutagenesis as well as by the
occasional identification of naturally occurring mutants
with an altered mRNA stability.

In short, to evaluate whether putative
regulatory sequences are sufficient to confer mRNA
stability control, DNA sequences coding for the suspected
INS regions are fused to an indicator (or reporter) gene
to create a gene coding for a hybrid mRNA. The DNA
sequences fused to the indicator (or reporter) gene can be
cDNA, genomic DNA or synthesized DNA. Examples of
indicator (or reporter) genes that are described in the
references set forth in the "Background Art" section
include the genes for neomycin resistance, β-galactosidase,
chloramphenicol acetyltransferase (CAT), and luciferase,
as well as the genes for β-globin, PGK1 and ACT1. See
also Sambrook et al., Molecular Cloning, A Laboratory
Manual, 2d ed., Cold Spring Harbor Laboratory, Cold Spring
Harbor, New York (1989), pp. 16.56-16.67. Other genes
which can be used as indicator genes are disclosed herein
(i.e., the gag gene of the Rous Sarcoma Virus (which lacks
an inhibitory/instability region) and the Rev independent
HIV-1 gag genes of constructs p17M1234, p37M1234 and
p37M1-10D, which have been mutated to inactivate the
inhibitory/instability region and which constitute one
aspect of the invention). In general, virtually any gene
encoding a mRNA which is stable or which is expressed at
relatively high levels (defined here as being stable
enough or expressed at high enough level so that any
decrease in the level of the mRNA or expressed protein can
be detected by standard methods) can be used as an
indicator or reporter gene, although the constructs
p37M1234 and p37M1-10D, which are exemplified herein, are
preferred for reasons set forth below. Preferred methods
of creating hybrid genes using these constructs and
testing the expression of mRNA and protein from these
constructs are also set forth below.

In general, the stability and/or utilization of
the mRNAs generated by the indicator gene and the hybrid
genes consisting of the indicator gene fused to the
sequences suspected of encoding an INS region are tested
by transfecting the hybrid genes into host cells which are
appropriate for the expression vector used to clone and
express the mRNAs. The resulting levels of mRNA are
determined by standard methods of determining mRNA
stability, e.g., Northern blots, S1 mapping or PCR methods,
and the resulting levels of protein produced are
quantitated by protein measuring assays, such as ELISA,
immunoprecipitation and/or western blots. The
inhibitory/instability region (or regions, if there are
more than one) will be identified by a decrease in the
protein expression and/or stability of the hybrid mRNA as
compared to the control indicator mRNA. Note that if the
ultimate goal is to increase production of the encoded
protein, the identification of the INS is most preferably
carried out in the same host cell as will be used for the
production of the protein.

Examples of some of the host cells that have
been used to detect INS sequences include somatic
mammalian cells, Xenopus oocytes, yeast and E. coli. See,
e.g., G. Shaw and R. Kamen, Cell 46:659-667 (1986)
(discussed supra), which localized instability sequences in
GM-CSF by inserting putative inhibitory sequences into the
3' UTR of the β-globin gene, causing the otherwise stable
β-globin mRNA to become unstable when transfected into
mouse or human cells. See also I. Laird-Offringa et al.,
Nucleic Acids Res. 19:2387-2394 (1991), which localized
inhibitory/instability sequences in c-myc using hybrid
c-myc-neomycin resistance genes introduced into rat
fibroblasts, and M. Lundrigan et al., Proc. Natl. Acad.
Sci. USA 88:1479-1483 (1991), which localized
inhibitory/instability sequences in the btuB gene by using
hybrid btuB-lacZ genes introduced into E. coli. For
examples of reported localization of specific
inhibitory/instability sequences within a transcript of
HIV-1 by destabilization of an otherwise long-lived
indicator transcript, see, e.g., M. Emerman, Cell
57:1155-1165 (1989) (replaced 3' UTR of env gene with part
of HBV and introduced into COS-1 cells); S. Schwartz et
al., J. Virol. 66:150-159 (1992) (gag gene fusions with Rev
independent tat reporter gene introduced into HeLa cells);
F. Maldarelli et al., J. Virol. 65:5732-5743 (1991)
(gag/pol gene fusions with Rev independent tat reporter
gene or chloramphenicol acetyltransferase (CAT) gene
introduced into HeLa and SW480 cells); and A. Cochrane et
al., J. Virol. 65:5303-5313 (1991) (pol gene fusions with
CAT gene or rat proinsulin gene introduced into COS-1 and
CHO cells).

It is anticipated that in vitro mRNA degradation
systems (e.g., crude cytoplasmic extracts) to assay mRNA
turnover in vitro will complement ongoing in vivo analyses
and help to circumvent some of the limitations of the in
vivo systems. See M.W. Hentze et al., Biochim. Biophys.
Acta 1090:281-292 (1991) and references cited therein.
See also D. Wreschner and G. Rechavi, Eur. J. Biochem.
172:333-340 (1988), which analyzed exogenous mRNA
stability in a reticulocyte lysate cell-free system.

In the method of the invention, the whole gene
of interest may be fused to an indicator or reporter gene
and tested for its effect on the resulting hybrid mRNA in
order to determine whether that gene contains an
inhibitory/instability region or regions. To further
localize the INS within the gene of interest, fragments of
the gene of interest may be prepared by sequentially
deleting sequences from the gene of interest from either
the 5' or 3' ends or both. The gene of interest may also
be separated into overlapping fragments by methods known
in the art (e.g., with restriction endonucleases, etc.).
See, e.g., S. Schwartz et al., J. Virol. 66:150-159
(1992). Preferably, the gene is separated into
overlapping fragments about 300 to 2000 nucleotides in
length. Two types of vector constructs can be made. To
permit the detection of inhibitory/instability regions
that do not need to be translated in order to function,
vectors can be constructed in which the gene of interest
(or its fragments or suspected INS) can be inserted into
the 3' UTR downstream from the stop codon of an indicator
or reporter gene. This does not permit translation
through the INS. To test the possibility that some
inhibitory/instability sequences may act only after
translation of the mRNA, vectors can be constructed in
which the gene of interest (or its fragments or suspected
INS) is inserted into the coding region of the
indicator/reporter gene. This method will permit the
detection of inhibitory/instability regions that do need
to be translated in order to function. The hybrid
constructs are transfected into host cells, and the
resulting mRNA levels are determined by standard methods
of determining mRNA stability, e.g., Northern blots, S1
mapping or PCR methods, as set forth above and as
described in most of the references cited in the
"Background Art" section. See also Sambrook et al.
(1989), supra, for experimental methods. The protein
produced from such genes is also easily quantitated by
existing assays, such as ELISAs, immunoprecipitation and
western blots, which are also described in most of the
references cited in the "Background Art" section. See
also Sambrook et al. (1989), supra, for experimental
methods. The hybrid DNAs containing the
inhibitory/instability region (or regions, if there are
more than one) will be identified by a decrease in the
protein expression and/or stability of the hybrid mRNA as
compared to the control indicator mRNA. The use of
various fragments of the gene permits the identification
of multiple independently functional
inhibitory/instability regions, if any, while the use of
overlapping fragments lessens the possibility that an
inhibitory/instability region will not be identified as a
result of its being cut in half, for example.
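
The division of a gene into overlapping fragments can be planned computationally before any cloning is done. The sketch below is illustrative only: the function name, the default fragment length of 600 nucleotides and the default overlap of 150 nucleotides are assumptions chosen to fall within the 300 to 2000 nucleotide range stated above, and in practice the fragment boundaries would be dictated by available restriction sites or primer design.

```python
def overlapping_fragments(seq, frag_len=600, overlap=150):
    """Split seq into overlapping windows.

    Returns a list of (start, end, fragment) tuples with 0-based,
    end-exclusive coordinates. The final window is anchored to the 3' end
    so that the whole sequence is covered by full-length fragments.
    """
    if not 0 <= overlap < frag_len:
        raise ValueError("overlap must be non-negative and smaller than frag_len")
    if len(seq) <= frag_len:
        return [(0, len(seq), seq)]
    step = frag_len - overlap
    fragments = []
    start = 0
    while start + frag_len < len(seq):
        fragments.append((start, start + frag_len, seq[start:start + frag_len]))
        start += step
    last_start = len(seq) - frag_len
    fragments.append((last_start, len(seq), seq[last_start:]))
    return fragments


# Hypothetical 2000-nucleotide sequence used only to show the coordinates.
example = "ACGT" * 500
plan = [(s, e) for s, e, _ in overlapping_fragments(example)]
```

Because adjacent windows share at least the requested overlap, an element shorter than the overlap cannot be split across every fragment that contains part of it.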

The exemplified test vectors set forth in Fig.
1(B) and Fig. 6 and described herein, e.g., vectors
p17M1234, p37M1234, p37M1-10D and p19, can be used to
assay for the presence and location of INS in various
RNAs, including INS which are located within coding
regions. These vectors can also be used to determine
whether a gene of interest not yet characterized has INS
which are candidates for mutagenesis curing. These
vectors have a particular advantage over the prior art in
that the same vectors can be used in the mutagenesis step
of the invention (described below) in which the identified
INS is eliminated without affecting the coding capacity of
the gene.

The method of using these vectors involves
introducing the entire gene, entire cDNA or fragments of
the gene ranging from approximately 300 nucleotides to
approximately 2 kilobases 3' to the coding region for gag
protein using unique restriction sites which are
engineered into the vectors. The expression of the gag
gene in HLtat cells is measured at both the RNA and
protein levels, and compared to the expression of the
starting vectors. A decrease in expression indicates the
presence of INS candidates that may be cured by
mutagenesis. The method of using the vectors exemplified
in Fig. 1 herein involves introducing the entire gene and
fragments of the gene of interest into vectors p17M1234,
p37M1234 and p19. The fragments are
preferably 300-2000 nucleotides long. Plasmid DNA is
prepared in E. coli and purified by the CsCl method.

To permit detection of inhibitory/instability 
regions which do not need to be translated in order to 
function, the entire gene and fragments of the gene of 
interest are introduced into vectors p17M1234, p37M1234 or
p19 3' of the stop codon of the p17gag coding region. To
allow the detection of inhibitory/instability regions that 
affect expression only when translated, the described 
vectors can be manipulated so that the coding region of 
the entire gene or fragments of the gene of interest are 
fused in frame to the expressed gag protein gene. For 
example, a fragment containing all or part of the coding 
region of the gene of interest can be inserted exactly 3' 
to the termination codon of the gag coding sequence in 
vector p37M1234 and the termination codon of gag and the 
linker sequences can be removed by oligonucleotide 
mutagenesis in such a way as to fuse the gag reading frame 
to the reading frame of the gene of interest. 

RNA and protein production from the two
expression vectors (e.g., p37M1234 containing the fragment
of the gene of interest inserted directly 3' of the stop
codon of the gag coding region, with the gag termination
codon intact, and p37M1234 containing the fragment of the
gene of interest inserted in frame with the gag coding
region, with the gag termination codon deleted) are then
compared after transfection of purified DNA into HLtat
cells.

The expression of these vectors after
transfection into human cells is monitored at both the
level of RNA and protein production. RNA levels are
quantitated by, e.g., Northern blots, S1 mapping or PCR
methods. Protein levels are quantitated by, e.g., western
blot or ELISA methods. p37M1234 and p37M1-10D are ideal
for quantitative analysis because a fast non-radioactive
ELISA protocol can be used to detect gag protein (DUPONT
or COULTER gag antigen capture assay). A decrease in the
level of expression of the gag antigen indicates the
presence of inhibitory/instability regions within the
cloned gene or fragment of the gene of interest.
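
The comparison of each hybrid construct with the starting vector can be expressed as a fraction of control expression. The sketch below is a hypothetical illustration: the function name, the fragment labels, the antigen values and the 0.5 cutoff are all assumptions, since the specification does not state a numerical threshold for calling a decrease.

```python
def flag_ins_candidates(control_pg_ml, hybrid_pg_ml, max_fraction=0.5):
    """Return {fragment: fraction_of_control} for hybrids below the cutoff.

    control_pg_ml is the gag antigen level from the starting vector;
    hybrid_pg_ml maps each inserted fragment to its measured level.
    """
    if control_pg_ml <= 0:
        raise ValueError("control expression must be positive")
    flagged = {}
    for name, value in hybrid_pg_ml.items():
        fraction = value / control_pg_ml
        if fraction < max_fraction:
            flagged[name] = fraction
    return flagged


# Hypothetical antigen capture results (pg/ml) for two inserted fragments.
candidates = flag_ins_candidates(1000.0, {"fragment_1": 950.0, "fragment_2": 100.0})
```

Any cutoff should be chosen with the variability of duplicate transfections in mind.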

After the inhibitory/instability regions have
been identified, the vectors containing the appropriate
INS fragments can be used to prepare single-stranded DNA
and then used in mutagenesis experiments with specific
chemically synthesized oligonucleotides in the clustered
mutagenesis protocol described below.

3. Mutation of the Inhibitory/Instability
Regions to Generate Stable mRNAs

Once the inhibitory/instability sequences are
located within the coding region of an mRNA, the gene is
modified to remove these inhibitory/instability sequences
without altering the coding capacity of the gene.
Alternatively, the gene is modified to remove the
inhibitory/instability sequences, simultaneously altering
the coding capacity of the gene to encode either
conservative or non-conservative amino acid substitutions.

In the method of the invention, the most general
method of eliminating the INS in the coding region of the
gene of interest is by making multiple mutations in the
INS region of the gene or gene fragments, without changing
the amino acid sequence of the protein encoded by the
gene; or, if conservative and non-conservative amino acid
substitutions are to be made, the only requirement is that
the protein encoded by the mutated gene be substantially
similar to the protein encoded by the non-mutated gene.
It is preferred that all of the suspected
inhibitory/instability regions, if more than one, be
mutated at once. Later, if desired, each
inhibitory/instability region can be mutated separately in
order to determine the smallest region of the gene that
needs to be mutated in order to generate a stable mRNA.
The ability to mutagenize long DNA regions at the same
time can decrease the time and effort needed to produce
the desired stable and/or highly expressed mRNA and
resulting protein. The altered gene or gene fragments
containing these mutations will then be tested in the
usual manner, as described above, e.g., by fusing the
altered gene or gene fragment with a reporter or indicator
gene and analyzing the level of mRNA and protein produced
by the altered genes after transfection into an
appropriate host cell. If the level of mRNA and protein
produced by the hybrid gene containing the altered gene or
gene fragment is about the same as that produced by the
control construct encoding only the indicator gene, then
the inhibitory/instability regions have been effectively
eliminated from the gene or gene fragment due to the
alterations made in the INS.

In the method of the invention, more than two
point mutations will be made in the INS region.
Optionally, point mutations may be made in at least about
10% of the nucleotides in the inhibitory/instability
region. These point mutations may also be clustered. The
nucleotides to be altered can be chosen randomly (i.e.,
not chosen because of AT or GC content or the presence or
absence of rare or preferred codons), the only requirement
being that the amino acid sequence encoded by the protein
remain unchanged; or, if conservative and non-conservative
amino acid substitutions are to be made, the only
requirement is that the protein encoded by the mutated
gene be substantially similar to the protein encoded by
the non-mutated gene.
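
The number and density of point mutations in a candidate mutant can be checked by direct comparison with the wild-type region. This is a minimal sketch under stated assumptions: the function names and the example sequences are hypothetical, and the 10% figure is treated here as the optional criterion described above rather than as a requirement.

```python
def point_mutation_stats(wild_type, mutant):
    """Return (count, fraction, positions) of nucleotide differences.

    Positions are 0-based. The sequences must have equal length because
    only substitutions, not insertions or deletions, are compared.
    """
    wt, mut = wild_type.upper(), mutant.upper()
    if len(wt) != len(mut) or not wt:
        raise ValueError("sequences must be non-empty and of equal length")
    positions = [i for i, (a, b) in enumerate(zip(wt, mut)) if a != b]
    return len(positions), len(positions) / len(wt), positions


def meets_mutation_criteria(wild_type, mutant, min_fraction=0.10):
    """True if more than two positions differ and at least min_fraction of the region is altered."""
    count, fraction, _ = point_mutation_stats(wild_type, mutant)
    return count > 2 and fraction >= min_fraction


# Hypothetical 20-nucleotide region with three substitutions.
stats = point_mutation_stats("AAAAAAAAAAAAAAAAAAAA", "AAGAAGAAGAAAAAAAAAAA")
```

This check says nothing about whether the substitutions are silent; that requires translating both sequences.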

In the method of the present invention, the gene 
sequence can be mutated so that the encoded protein 
remains the same due to the fact that the genetic code is 
degenerate, i.e., many of the amino acids may be encoded 
by more than one codon. The base code for serine, for
example, is six-way degenerate such that the codons TCT,
TCG, TCC, TCA, AGT, and AGC all code for serine.
Similarly, threonine is encoded by any one of codons ACT,
ACA, ACC and ACG. Thus, a plurality of different DNA
sequences can be used to code for a particular set of
amino acids. The codons encoding the other amino acids
are TTT and TTC for phenylalanine; TTA, TTG, CTT, CTC, CTA
and CTG for leucine; ATT, ATC and ATA for isoleucine; ATG
for methionine; GTT, GTC, GTA and GTG for valine; CCT, CCC,
CCA and CCG for proline; GCT, GCC, GCA and GCG for
alanine; TAT and TAC for tyrosine; CAT and CAC for
histidine; CAA and CAG for glutamine; AAT and AAC for
asparagine; AAA and AAG for lysine; GAT and GAC for
aspartic acid; GAA and GAG for glutamic acid; TGT and TGC
for cysteine; TGG for tryptophan; CGT, CGC, CGA, CGG, AGA
and AGG for arginine; and GGT, GGC, GGA and GGG for glycine.
Charts depicting the codons (i.e., the genetic code) can
be found in various general biology or biochemistry
textbooks.
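
The degeneracy described above can be illustrated with a short
sketch. The Python below is only an illustration and is not part
of the disclosed method; the names CODON_TABLE, synonymous_codons
and translate are hypothetical. It builds the standard genetic
code and lists the codons that encode a given amino acid.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """Return all codons encoding the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def translate(dna):
    """Translate a coding sequence whose length is a multiple of 3."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


print(synonymous_codons("S"))  # the six serine codons
print(translate("AAATCT") == translate("AAGAGC"))  # same protein, different DNA
```

Two DNA sequences that translate to the same protein, as in the
last line, differ only by silent (synonymous) changes.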

In the method of the present invention, if the 
portion(s) of the gene encoding the inhibitory/instability
regions are AT-rich, it is preferred, but not believed to
be necessary, that most or all of the mutations in the
inhibitory/instability region be the replacement of A and
T with G and C nucleotides, making the regions more
GC-rich, while still maintaining the coding capacity of the
gene. If the portion(s) of the gene encoding the
inhibitory/instability regions are GC-rich, it is
preferred, but not believed to be necessary, that most or
all of the mutations in the inhibitory/instability region
be the replacement of G and C nucleotides with A and T
nucleotides, making the regions less GC-rich, while still
maintaining the coding capacity of the gene. If the INS
region is either AT-rich or GC-rich, it is most preferred
that it be altered so that it has a content of about 50% G
and C and about 50% A and T. The AT- (or AU-) content
(or, alternatively, the GC-content) of an
inhibitory/instability region or regions can be calculated
by using a computer program designed to make such
calculations. Examples of such programs, used to
determine the AT-richness of the HIV-1 gag
inhibitory/instability regions exemplified herein, are the
GCG Analysis Package for the VAX (University of Wisconsin)
and the Gene Works Package (Intelligenetics).
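
As a rough illustration of the kind of calculation such programs
perform, the following Python sketch counts AT (or AU) and GC
nucleotides in a sequence. It is not the GCG or Gene Works
software; the function names at_gc_counts and gc_fraction are
hypothetical. The example input is the M1 oligonucleotide
(SEQ ID NO: 6) given later in this specification.

```python
def at_gc_counts(seq):
    """Return (AT count, GC count) for a DNA or RNA sequence."""
    seq = seq.upper()
    at = sum(seq.count(base) for base in "ATU")
    gc = sum(seq.count(base) for base in "GC")
    return at, gc


def gc_fraction(seq):
    """Fraction of G and C among the counted nucleotides."""
    at, gc = at_gc_counts(seq)
    return gc / (at + gc)


# M1 oligonucleotide (SEQ ID NO: 6) as printed in this specification.
M1 = "ccagggggaaagaagaagtacaagctaaagcacatcgtatgggcaagcagg"

at, gc = at_gc_counts(M1)
print(f"M1: {at}AT/{gc}GC, GC fraction {gc_fraction(M1):.2f}")
```

A region whose GC fraction is far from 0.5 in either direction
would be a candidate for the rebalancing described above.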

In the method of the invention, if the INS 
region contains less-preferred codons, it is preferable
that those be altered to more-preferred codons. If
desired, however (e.g., to make an AT-rich region more
GC-rich), more-preferred codons can be altered to
less-preferred codons. It is also preferred, but not
believed to be necessary, that less-preferred or rarely
used codons be replaced with more-preferred codons.
Optionally, only the most rarely used codons (identified
from published codon usage tables, such as in T. Maruyama
et al., Nucl. Acids Res. 14(Supp.):r151-r197 (1986)) can
be replaced with preferred codons, or alternatively, most
or all of the rare codons can be replaced with preferred
codons. Generally, the choice of preferred codons to use
will depend on the codon usage of the host cell in which
the altered gene is to be expressed. Note, however, that
the substitution of more-preferred codons with
less-preferred codons is also functional, as shown in the
example below.

As noted above, coding sequences are chosen on 
the basis of the genetic code and, preferably on the 
preferred codon usage in the host cell or organism in 
which the mutated gene of this invention is to be 
expressed. In a number of cases the preferred codon usage 
of a particular host or expression system can be 
ascertained from available references (see, e.g., T.
Maruyama et al., Nucl. Acids Res. 14(Supp.):r151-r197
(1986)), or can be ascertained by other methods (see,
e.g., U.S. Patent No. 5,082,767 entitled "Codon Pair
Utilization", issued to G. W. Hatfield et al. on January
21, 1992, which is incorporated herein by reference).
Preferably, sequences will be chosen to optimize 
transcription and translation as well as mRNA stability so 
as to ultimately increase the amount of protein produced. 
Selection of codons is thus, for example, guided by the 
preferred use of codons by the host cell and/or the need 
to provide for desired restriction endonuclease sites and 
could also be guided by a desire to avoid potential 
secondary structure constraints in the encoded mRNA 
transcript. Potential secondary structure constraints can 
be identified by the use of computer programs such as the 
one described in M. Zuker et al., Nucl. Acids Res. 9:133
(1981). More than one coding sequence may be chosen in
situations where the codon preference is unknown or 
ambiguous for optimum codon usage in the chosen host cell 
or organism. However, any correct set of codons would 
encode the desired protein, even if translated with less 
than optimum efficiency. 
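
The selection of codons by host preference can be sketched as
follows. This is a minimal Python illustration under stated
assumptions, not the procedure actually used: the function name
recode_to_preferred is hypothetical, and the usage table is
deliberately partial, containing only the per-1000-codon values
for human genes that are quoted in this specification for amino
acids with exactly two codons.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Partial usage table (occurrences per 1000 codons in human genes),
# limited to values quoted in this specification.
USAGE = {
    "AAA": 22.0, "AAG": 35.8,  # Lys
    "TAT": 12.4, "TAC": 18.4,  # Tyr
    "CAT": 9.8, "CAC": 14.3,   # His
    "GAA": 26.8, "GAG": 41.6,  # Glu
    "AAT": 16.9, "AAC": 23.6,  # Asn
    "CAA": 11.5, "CAG": 32.7,  # Gln
}


def recode_to_preferred(dna, usage=USAGE):
    """Replace each codon by the most frequent synonymous codon in `usage`.

    Codons whose amino acid has no entry in `usage` are left unchanged,
    and any incomplete trailing codon is dropped.
    """
    dna = dna.upper()
    out = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[i:i + 3]
        aa = CODON_TABLE[codon]
        candidates = [c for c in usage if CODON_TABLE[c] == aa]
        out.append(max(candidates, key=usage.get) if candidates else codon)
    return "".join(out)


print(recode_to_preferred("AAATATCATGGG"))
```

A fuller implementation would use a complete usage table for the
chosen host and could add further constraints, such as
restriction sites or predicted secondary structure.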

In the method of the invention, if the INS 
region contains conserved nucleotides, it is also 
preferred, but not believed to be necessary, that
conserved nucleotide sequences in the
inhibitory/instability region be mutated. Optionally, at
least approximately 75% of the mutations made in the 
inhibitory/instability region may involve the mutation of 
conserved nucleotides. Conserved nucleotides can be 
determined by using a variety of computer programs 
available to practitioners of the art. 

In the method of the invention, it is also 
anticipated that inhibitory/instability sequences can be
mutated such that the encoded amino acids are changed to
contain one or more conservative or non-conservative amino
acids yet still provide for a functionally equivalent 
protein. For example, one or more amino acid residues 
within the sequence can be substituted by another amino 
acid of a similar polarity which acts as a functional 
equivalent, resulting in a neutral substitution in the 
amino acid sequence. Substitutes for an amino acid within 
the sequence may be selected from other members of the 
class to which the amino acid belongs. For example, the 
nonpolar (hydrophobic) amino acids include alanine, 
leucine, isoleucine, valine, proline, phenylalanine, 
tryptophan and methionine. The polar neutral amino acids 
include glycine, serine, threonine, cysteine, tyrosine, 
asparagine, and glutamine. The positively charged (basic) 
amino acids include arginine, lysine and histidine. The 
negatively charged (acidic) amino acids include aspartic 
acid and glutamic acid. 
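
The four classes listed above can be written down as a simple
lookup. The Python sketch below is illustrative only; the names
AMINO_ACID_CLASSES, amino_acid_class and is_conservative are
hypothetical, and "conservative" is taken here to mean
"within the same class as listed in the preceding paragraph".

```python
# Amino acid classes as listed in the text (one-letter codes).
AMINO_ACID_CLASSES = {
    "nonpolar": set("ALIVPFWM"),
    "polar neutral": set("GSTCYNQ"),
    "basic": set("RKH"),
    "acidic": set("DE"),
}


def amino_acid_class(residue):
    """Return the class name for a one-letter amino acid code."""
    for name, members in AMINO_ACID_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown amino acid: {residue!r}")


def is_conservative(original, substitute):
    """True if both residues belong to the same class."""
    return amino_acid_class(original) == amino_acid_class(substitute)


print(is_conservative("L", "I"))  # leucine -> isoleucine
print(is_conservative("K", "D"))  # lysine -> aspartic acid
```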

In the exemplified method of the present
invention, all of the regions in the HIV-1 gag gene
suspected to have inhibitory/instability activity were
first mutated at once over a region approximately 270
nucleotides in length using clustered site-directed
mutagenesis with four different oligonucleotides spanning
a region of approximately 300 nucleotides to generate the
construct p17M1234, described infra, which encodes a
stable mRNA.
The four oligonucleotides, which are depicted in
Fig. 4, are
M1: ccagggggaaagaagaagtacaagctaaagcacatcgtatgggcaagcagg
(SEQ ID NO: 6);
M2: ccttcagacaggatcagaggagcttcgatcactatacaacacagtagc
(SEQ ID NO: 7);
M3: accctctattgtgtgcaccagcggatcgagatcaaggacaccaaggaagc
(SEQ ID NO: 8); and
M4: gagcaaaacaagtccaagaagaaggcccagcaggcagcagctgacacagg
(SEQ ID NO: 9).
These oligonucleotides are 51 (M1), 48 (M2), 50 (M3) and
50 (M4) nucleotides in length. Each oligonucleotide
introduced several point mutations over an area of 19-22
nucleotides (see infra). The number of nucleotides 5' to
the first mutated nucleotide was 14 (M1); 18 (M2); 17
(M3); and 11 (M4); and the number of nucleotides 3' to the
last mutated nucleotide was 15 (M1); 8 (M2); 14 (M3); and
17 (M4). The ratios of AT to GC nucleotides present in
each of these regions before mutation were 33AT/18GC (M1);
30AT/18GC (M2); 29AT/21GC (M3) and 27AT/23GC (M4). The
ratios of AT to GC nucleotides present in each of these
regions after mutation were 25AT/26GC (M1); 24AT/24GC
(M2); 23AT/27GC (M3) and 22AT/28GC (M4). A total of 26
codons were changed. The number of times the codon appears
in human genes per 1000 codons (from T. Maruyama et al.,
Nucl. Acids Res. 14(Supp.):r151-r197 (1986)) is listed in
parentheses next to the codon. In the example, 8 codons
encoding lysine (Lys) were changed from aaa (22.0) to aag
(35.8); two codons encoding tyrosine (Tyr) were changed
from tat (12.4) to tac (18.4); two codons encoding leucine
(Leu) were changed from tta (5.9) to cta (6.1); two codons
encoding histidine (His) were changed from cat (9.8) to
cac (14.3); three codons encoding isoleucine (Ile) were
changed from ata (5.1) to atc (24.0); two codons encoding
glutamic acid (Glu) were changed from gaa (26.8) to gag
(41.6); one codon encoding arginine (Arg) was changed from
aga (10.8) to cga (5.2) and one codon encoding arginine
(Arg) was changed from agg (11.4) to cgg (7.7); one codon
encoding asparagine (Asn) was changed from aat (16.9) to
aac (23.6); two codons encoding glutamine (Gln) were
changed from caa (11.5) to cag (32.7); one codon encoding
serine (Ser) was changed from agt (8.7) to tcc (18.7); and
one codon encoding alanine (Ala) was changed from gca
(12.7) to gcc (29.8).
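
The codon changes listed above can be checked mechanically. The
following Python sketch is an illustration only; the names
CHANGES and is_silent are hypothetical. It tabulates the listed
substitutions with their usage values, confirms that each one
preserves the encoded amino acid, and identifies the
substitutions that move to a less-preferred codon.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# (original codon, new codon, codons changed, usage before, usage after)
CHANGES = [
    ("aaa", "aag", 8, 22.0, 35.8),  # Lys
    ("tat", "tac", 2, 12.4, 18.4),  # Tyr
    ("tta", "cta", 2, 5.9, 6.1),    # Leu
    ("cat", "cac", 2, 9.8, 14.3),   # His
    ("ata", "atc", 3, 5.1, 24.0),   # Ile
    ("gaa", "gag", 2, 26.8, 41.6),  # Glu
    ("aga", "cga", 1, 10.8, 5.2),   # Arg
    ("agg", "cgg", 1, 11.4, 7.7),   # Arg
    ("aat", "aac", 1, 16.9, 23.6),  # Asn
    ("caa", "cag", 2, 11.5, 32.7),  # Gln
    ("agt", "tcc", 1, 8.7, 18.7),   # Ser
    ("gca", "gcc", 1, 12.7, 29.8),  # Ala
]


def is_silent(old, new):
    """True if both codons encode the same amino acid."""
    return CODON_TABLE[old.upper()] == CODON_TABLE[new.upper()]


total_codons = sum(n for _, _, n, _, _ in CHANGES)
all_silent = all(is_silent(old, new) for old, new, *_ in CHANGES)
to_less_preferred = [(old, new) for old, new, _, before, after in CHANGES
                     if after < before]

print(total_codons, all_silent, to_less_preferred)
```

The substitutions that lower the usage value are consistent with
the statement that replacing more-preferred codons with
less-preferred codons is also functional.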

The techniques of oligonucleotide-directed
site-specific mutagenesis employed to effect the
modifications in structure or sequence of the DNA molecule
are known to those of skill in the art. The target DNA
sequences which are to be mutagenized can be cDNA, genomic
DNA or synthesized DNA sequences. Generally, these DNA
sequences are cloned into an appropriate vector, e.g., a
bacteriophage M13 vector, and single-stranded template DNA
is prepared from a plaque generated by the recombinant
bacteriophage. The single-stranded DNA is annealed to the
synthetic oligonucleotides and the mutagenesis and
subsequent steps are performed by methods well known in
the art. See, e.g., M. Smith and S. Gillam, in Genetic
Engineering: Principles and Methods, Plenum Press 3:1-32
(1981) (review) and T. Kunkel, Proc. Natl. Acad. Sci. USA
82:488-492 (1985). See also Sambrook et al. (1989),
supra. The synthetic oligonucleotides can be synthesized
on a DNA synthesizer (e.g., Applied Biosystems) and
purified by electrophoresis by methods known in the art.
The length of the selected or prepared
oligodeoxynucleotides using this method can vary. There
are no absolute size limits. As a matter of convenience,
for use in the process of this invention, the shortest
length of the oligodeoxynucleotide is generally
approximately 20 nucleotides and the longest length is
generally approximately 60 to 100 nucleotides. The size
of the oligonucleotide primers is determined by the
requirement for stable hybridization of the primers to the
regions of the gene in which the mutations are to be
induced, and by the limitations of the currently available
methods for synthesizing oligonucleotides. The factors to
be considered in designing oligonucleotides for use in
oligonucleotide-directed mutagenesis (e.g., overall size,
size of portions flanking the mutation(s)) are described
by M. Smith and S. Gillam in Genetic Engineering:
Principles and Methods, Plenum Press 3:1-32 (1981). In
general, the overall length of the oligonucleotide will be
such as to optimize stable, unique hybridization at the
mutation site with the 5' and 3' extensions from the
mutation site being of sufficient size to avoid editing of
the mutation(s) by the exonuclease activity of the DNA
polymerase. Oligonucleotides used for mutagenesis in the
present invention will generally be at least about 20
nucleotides, usually about 40 to 60 nucleotides in length
and usually will not exceed about 100 nucleotides in
length. The oligonucleotides will usually contain at
least about five bases 3' of the altered codons.

In the preferred mutagenesis protocol of the
present invention, the INS-containing expression vectors
contain the BLUESCRIPT plasmid vector as a backbone. This
enables the preparation of double-stranded as well as
single-stranded DNA. Single-stranded uracil-containing
DNA is prepared according to a standard protocol as
follows: The plasmid is transformed into an F' bacterial
strain (e.g., DH5αF'). A colony is grown and infected
with the helper phage M13-VCS [Stratagene #20025; 1 x 10^11
pfu/ml]. This phage is used to infect a culture of the E.
coli strain CJ236 and single-stranded DNA is isolated
according to standard methods. 0.25 ug of single-stranded
DNA is annealed with the synthesized oligonucleotides (5
ul of each oligo, dissolved at a concentration of 5
OD260/ml). The synthesized oligonucleotides are usually
about 40 to 60 nucleotides in length and are designed to
contain a perfect match of approximately 10 nucleotides at
each end. They may contain as many changes as desired
within the remaining 20-40 nucleotides. The
oligonucleotides are designed to cover the region of
interest and they may be next to each other or there may
be gaps between them. Up to six different
oligonucleotides have been used at the same time, although
it is believed that the use of more than six
oligonucleotides at the same time would also work in the
method of this invention. After annealing, elongation
with T4 polymerase produces the second strand which does
not contain uracil. The free ends are ligated using
ligase. This results in double-stranded DNA which can be
used to transform E. coli strain HB101. The mutated
strand which does not contain uracil produces
double-stranded DNA, which contains the introduced
mutations. Individual colonies are picked and the
mutations are quickly verified by sequence analysis.
Alternatively or additionally, this mutagenesis method can
be (and has been) used to select for different
combinations of oligonucleotides which result in different
mutant phenotypes. This facilitates the analysis of the
regions important for function and is helpful in
subsequent experiments because it allows the analysis of
exact sequences involved in the INS. In addition to the
exemplified mutagenesis of the INS-1 region of HIV-1
described herein, this method has also been used to mutate
in one step a region of 150 nucleotides using three
tandemly arranged oligonucleotides that introduced a total
of 35 mutations. The upper limit of changes is not clear,
but it is estimated that regions of approximately 500
nucleotides can be changed in 20% of their nucleotides in
one step using this protocol.
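
The oligonucleotide design rule described above (a cluster of
changes flanked by approximately 10 perfectly matching
nucleotides at each end) can be sketched in Python as follows.
This is a hypothetical helper written for illustration, not the
procedure actually used to design M1 to M4; the function name
design_mutagenic_oligo and the toy sequences are assumptions.

```python
def design_mutagenic_oligo(wild_type, mutant, flank=10):
    """Extract a mutagenic oligonucleotide from `mutant`.

    The oligo spans every position at which `mutant` differs from
    `wild_type`, plus `flank` perfectly matching nucleotides on each
    side (fewer if the sequence ends first). Returns the oligo and
    the number of point mutations it carries.
    """
    if len(wild_type) != len(mutant):
        raise ValueError("sequences must have equal length (substitutions only)")
    diffs = [i for i, (w, m) in enumerate(zip(wild_type.lower(), mutant.lower()))
             if w != m]
    if not diffs:
        raise ValueError("no mutations found")
    start = max(0, diffs[0] - flank)
    end = min(len(mutant), diffs[-1] + 1 + flank)
    return mutant[start:end], len(diffs)


# Toy sequences (not from the specification): two silent changes,
# aaa -> aag (Lys) and tat -> tac (Tyr), in an arbitrary context.
wild_type = "g" * 12 + "aaatat" + "c" * 12
mutant = "g" * 12 + "aagtac" + "c" * 12

oligo, n_mutations = design_mutagenic_oligo(wild_type, mutant)
print(oligo, len(oligo), n_mutations)
```

In practice the flank length would be chosen with hybridization
stability in mind, and several such oligonucleotides could be
annealed to the same template at once, as described above.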

The exemplified method of mutating by using
oligonucleotide-directed site-specific mutagenesis may be
varied by using other methods known in the art. For
example, the mutated gene can be synthesized directly
using overlapping synthetic deoxynucleotides (see, e.g.,
Edge et al., Nature 292:756 (1981); Nambiar et al.,
Science 223:1299 (1984); Jay et al., J. Biol. Chem.
259:6311 (1984)); or by using a combination of polymerase
chain reaction generated DNAs or cDNAs and synthesized
oligonucleotides.

4. Determination of Stability of the 
Mutated mRNA 

The steady state level and/or stability of the
resultant mutated mRNAs can be tested in the same manner
as the steady state level and/or stability of the
unmodified mRNA containing the inhibitory/instability
regions is tested (e.g., by Northern blotting), as
discussed in section 1, above. The mutated mRNA can be
analyzed along with (and thus compared to) the unmodified
mRNA containing the inhibitory/instability region(s) and
with an unmodified indicator mRNA, if desired. As
exemplified, the HIV-1 p17gag mutants are compared to the
unmutated HIV-1 p17gag in transfection experiments by
subsequent analysis of the mRNAs by Northern blot
analysis. The proteins produced by these mRNAs are
measured by immunoblotting and other methods known in the
art, such as ELISA. See infra.

VI. INDUSTRIAL APPLICABILITY 

Genes which can be mutated by the methods of
this invention include those whose mRNAs are known or
suspected to contain INS regions. These
genes include, for example, those coding for growth
factors, interferons, interleukins, the fos proto-oncogene
protein, and HIV-1 gag, env and pol, as well as other
viral mRNAs in addition to those exemplified herein.
Genes mutated by the methods of this invention can be
expressed in the native host cell or organism or in a
different cell or organism. The mutated genes can be
introduced into a vector such as a plasmid, cosmid, phage,
virus or mini-chromosome and inserted into a host cell or
organism by methods well known in the art. In general,
the mutated genes or constructs containing these mutated
genes can be utilized in any cell, either eukaryotic or
prokaryotic, including mammalian cells (e.g., human (e.g.,
HeLa), monkey (e.g., Cos), rabbit (e.g., rabbit
reticulocytes), rat, hamster (e.g., CHO and baby hamster
kidney cells) or mouse cells (e.g., L cells)), plant cells,
yeast cells, insect cells or bacterial cells (e.g.,
E. coli). The vectors which can be utilized to clone and/or
express these mutated genes are the vectors which are
capable of replicating and/or expressing the mutated genes
in the host cell in which the mutated genes are desired to
be replicated and/or expressed. See, e.g., F. Ausubel et
al., Current Protocols in Molecular Biology, Greene
Publishing Associates and Wiley-Interscience (1992) and
Sambrook et al. (1989) for examples of appropriate vectors
for various types of host cells. The native promoters for
such genes can be replaced with strong promoters
compatible with the host into which the gene is inserted.
These promoters may be inducible. The host cells
containing these mutated genes can be used to express
large amounts of the protein useful in enzyme
preparations, pharmaceuticals, diagnostic reagents,
vaccines and therapeutics.

Genes altered by the methods of the invention or
constructs containing said genes may also be used for in
vivo or in vitro gene replacement. For example, a gene
which produces an mRNA with an inhibitory/instability
region can be replaced in situ with a gene that has been
modified by the method of the invention to ultimately
increase the amount of protein expressed. Such genes
include viral genes and/or cellular genes. Such gene
replacement might be useful, for example, in the
development of a vaccine and/or genetic therapy.

The constructs and/or proteins made by using
constructs encoding the exemplified altered gag, env, and
pol genes could be used, for example, in the production of
diagnostic reagents, vaccines and therapies for AIDS and
AIDS-related diseases. The inhibitory/instability
elements in the exemplified HIV-1 gag gene may be involved
in the establishment of a state of low virus production in
the host. HIV-1 and the other lentiviruses cause chronic
active infections that are not cleared by the immune
system. It is possible that complete removal of the
inhibitory/instability sequence elements from the
lentiviral genome would result in constitutive expression.
This could prevent the virus from establishing a latent
infection and escaping immune system surveillance. The
success in increasing expression of p17gag by eliminating
the inhibitory sequence element suggests that one could
produce lentiviruses without any negative elements. Such
lentiviruses could provide a novel approach towards
attenuated vaccines.

For example, vectors expressing high levels of
Gag can be used in immunotherapy and immunoprophylaxis,
after expression in humans. Such vectors include
retroviral vectors and also include direct injection of
DNA into muscle cells or other receptive cells, resulting
in the efficient expression of gag, using the technology
described, for example, in Wolff et al., Science
247:1465-1468 (1990), Wolff et al., Human Molecular
Genetics 1(6):363-369 (1992) and Ulmer et al., Science
259:1745-1749 (1993). Further, the gag constructs could be
used in transdominant inhibition of HIV expression after
the introduction into humans. For this application, for
example, appropriate vectors or DNA molecules expressing
high levels of p55gag or p37gag would be modified to
generate transdominant gag mutants, as described, for
example, in Trono et al., Cell 59:113-120 (1989). The
vectors would be introduced into humans, resulting in the
inhibition of HIV production due to the combined
mechanisms of gag transdominant inhibition and of
immunostimulation by the produced gag protein. In
addition, the gag constructs of the invention could be
used in the generation of new retroviral vectors based on
the expression of lentiviral gag proteins. Lentiviruses
have unique characteristics that may allow the targeting
and efficient infection of non-dividing cells. Similar
applications are expected for vectors expressing high
levels of env.

Identification of similar inhibitory/instability
elements in SIV indicates that this virus may provide a
convenient model to test these hypotheses.

The exemplified constructs can also be used to
simply and rapidly detect and/or further define the
boundaries of inhibitory/instability sequences in any mRNA
which is known or suspected to contain such regions, e.g.,
in mRNAs encoding various growth factors, interferons or
interleukins, as well as other viral mRNAs in addition to
those exemplified herein.

The following examples illustrate certain 
embodiments of the present invention, but should not be 
construed as limiting its scope in any way. Certain 
modifications and variations will be apparent to those 
skilled in the art from the teachings of the foregoing 
disclosure and the following examples, and these are 
intended to be encompassed by the spirit and scope of the 
invention. 



EXAMPLE 1 
HIV-1 GAG GENE 
The interaction of the Rev regulatory protein of
human immunodeficiency virus type 1 (HIV-1) with its RNA
target, named the Rev-responsive element (RRE), is
necessary for expression of the viral structural proteins
(for reviews see G. Pavlakis and B. Felber, New Biol.
2:20-31 (1990); B. Cullen and W. Greene, Cell 58:423-426
(1989); and C. Rosen and G. Pavlakis, AIDS J. 4:499-509
(1990)). Rev acts by promoting the nuclear export and
increasing the stability of the RRE-containing mRNAs.
Recent results also indicate a role for Rev in the
efficient polysome association of these mRNAs (S. Arrigo
and I. Chen, Genes Dev. 5:808-819 (1991); D. D'Agostino et
al., Mol. Cell. Biol. 12:1375-1386 (1992)). Since the
RRE-containing HIV-1 mRNAs do not efficiently produce
protein in the absence of Rev, it has been postulated that
these mRNAs are defective and contain
inhibitory/instability sequences variously designated as
INS, CRS, or IR (M. Emerman et al., Cell 57:1155-1165
(1989); S. Schwartz et al., J. Virol. 66:150-159 (1992);
C. Rosen et al., Proc. Natl. Acad. Sci. USA 85:2071-2075
(1988); M. Hadzopoulou-Cladaras et al., J. Virol.
63:1265-1274 (1989); F. Maldarelli et al., J. Virol.
65:5732-5743 (1991); A. W. Cochrane et al., J. Virol.
65:5305-5313 (1991)). The nature and function of these
inhibitory/instability sequences have not been
characterized in detail. It has been postulated that
inefficiently used splice sites may be necessary for Rev
function (D. Chang and P. Sharp, Cell 59:789-795 (1989));
the presence of such splice sites may confer
Rev-dependence to HIV-1 mRNAs.

Analysis of HIV-1 hybrid constructs led to the
initial characterization of some inhibitory/instability
sequences in the gag and pol regions of HIV-1 (S. Schwartz
et al., J. Virol. 66:150-159 (1992); F. Maldarelli et al.,
J. Virol. 65:5732-5743 (1991); A. W. Cochrane et al., J.
Virol. 65:5305-5313 (1991)). The identification of an
inhibitory/instability RNA element located in the coding
region of the p17gag matrix protein of HIV-1 was also
reported (S. Schwartz et al., J. Virol. 66:150-159
(1992)). It was shown that this sequence acted in cis to
inhibit HIV-1 tat expression after insertion into a tat
cDNA. The inhibition could be overcome by Rev-RRE,
demonstrating that this element plays a role in regulation
by Rev.

1. p17gag expression plasmid
To further study the inhibitory/instability
element in p17gag, a p17gag expression plasmid (p17, Fig. 1)
was constructed. The p17gag sequence was engineered to
contain a translational stop codon immediately after the
coding sequence and thus could produce only p17gag (the
construction of this plasmid is described below). The
major 5' splice site of HIV-1 upstream of the gag AUG has
been deleted from this vector (B. Felber et al., Proc.
Natl. Acad. Sci. USA 86:1495-1499 (1989)). To investigate
whether plasmid p17 could produce p17gag in the absence of
Rev and the RRE, p17 was transfected into HLtat cells (S.
Schwartz et al., J. Virol. 64:2519-2529 (1990)) (see
below). These cells constitutively produce HIV-1 Tat
protein, which is necessary for trans activation of the
HIV-1 LTR promoter. Plasmid p17 was transfected in the
absence or presence of Rev, and the production of p17gag
was analyzed by western immunoblotting. The results
revealed that very low levels of p17gag protein were
produced (Fig. 2A). The presence of Rev did not increase
gag expression, as expected, since this mRNA did not
contain the RRE. Next, a plasmid that contained both the
p17gag coding sequence and the RRE (p17R, Fig. 1) was
constructed. Like p17, this plasmid produced very low
levels of p17gag in the absence of Rev. High levels of
p17gag were produced only in the presence of Rev (Fig. 2A).
These experiments suggested that an inhibitory/instability
element was located in the p17gag coding sequence.

Expression experiments using various eucaryotic
vectors have indicated that several other retroviruses do
not contain such inhibitory/instability sequences within
their coding sequences (see, for example, J. Wills et al.,
J. Virol. 63:4331-43 (1989) and V. Morris et al., J.
Virol. 62:349-53 (1988)). To verify these results, the
p17gag (matrix) gene of HIV-1 in plasmid p17 was replaced
with the coding sequence for p19gag (matrix), which is the
homologous protein of the Rous sarcoma virus (RSV, strain
SR-A). The resulting plasmid, p19 (Fig. 1), was identical
to plasmid p17, except for the gag coding sequence. The
production of p19gag protein from plasmid p19 was analyzed
by western immunoblotting, which revealed that this
plasmid produced high levels of p19gag (Fig. 2A). These
experiments demonstrated that the p19gag coding sequence of
RSV, in contrast to p17gag of HIV-1, could be efficiently
expressed in this vector, indicating that the gag region
of RSV did not contain any inhibitory/instability
elements. A derivative of plasmid p19 that contained the
RRE, named p19R (Fig. 1), was also constructed.
Interestingly, only very low levels of p19gag protein were
produced from the RRE-containing plasmid p19R in the
absence of Rev. This observation indicated that the
introduced RRE and 3' HIV-1 sequences exerted an
inhibitory effect on p19gag expression from plasmid p19R,
which is in agreement with recent data indicating that in
the absence of Rev, a longer region at the 3' end of the
virus including the RRE acts as an inhibitory/instability
element (G. Nasioulas, G. Pavlakis, B. Felber, manuscript
in preparation). In conclusion, the high levels of
expression of RSV p19gag in the same vector reinforced the
conclusion that an inhibitory/instability sequence within
the HIV-1 p17gag coding region was responsible for the very
low levels of expression.

It was next determined whether the
inhibitory/instability effect of the p17gag coding sequence
was detected also at the mRNA level. Northern blot
analysis of RNA extracted from HLtat cells transfected
with p17 or transfected with p17R demonstrated that p17R
produced lower mRNA levels in the absence of Rev (Fig. 3A)
(See Example 3). A two- to eight-fold increase in p17R
mRNA levels was observed after coexpression with Rev.
Plasmid p17 produced mRNA levels similar to those produced
by p17R in the absence of Rev. Notably, Rev decreased the
levels of mRNA and protein produced by mRNAs that do not
contain RRE. This inhibitory effect of Rev in
cotransfection experiments has been observed for many
other non-RRE-containing mRNAs, such as luciferase and CAT
(L. Solomin et al., J. Virol. 64:6010-6017 (1990); D. M.
Benko et al., New Biol. 2:1111-1122 (1990)). These results
established that the inhibitory element in gag also
affects the mRNA levels and are in agreement with previous
findings (S. Schwartz et al., J. Virol. 66:150-159
(1992)). Quantitations of the mRNA and protein levels
produced by p17R in the absence or presence of Rev were
performed by scanning densitometry of appropriate serial
dilutions of the samples, and indicated that the
difference was greater at the level of protein (60- to
100-fold) than at the level of mRNA (2- to 8-fold). This
result is compatible with previous findings of effects of
Rev on mRNA localization and polysomal loading of both gag
and env mRNAs (S. Arrigo et al., Genes Dev. 5:808-819
(1991); D. D'Agostino et al., Mol. Cell. Biol.
12:1375-1386 (1992); M. Emerman et al., Cell 57:1155-1165
(1989); B. Felber et al., Proc. Natl. Acad. Sci. USA
86:1495-1499 (1989); M. Malim et al., Nature (London)
338:254-257 (1989)). Northern blot analysis of the mRNAs
produced by the RSV gag expression plasmids revealed that
p19 produced high mRNA levels (Fig. 3B). This further
demonstrated that the p19gag coding sequence of RSV does
not contain inhibitory elements. The presence of the RRE
and 3' HIV-1 sequences in plasmid p19R resulted in
decreased mRNA levels in the absence of Rev, further
suggesting that inhibitory elements were present in these
sequences. Taken together, these results established that
gag expression in HIV-1 is fundamentally different from
that in RSV. The HIV-1 p17gag coding sequence contains a
strong inhibitory element while the RSV p19gag coding
sequence does not. Interestingly, plasmid p19 contains the
5' splice site used to generate the RSV env mRNA, which is
located downstream of the gag AUG. This 5' splice site is
not utilized in the described expression vectors (Fig.
3B). Mutation of the invariable GT dinucleotide of this
5' splice site to AT did not affect p19gag expression
significantly (data not shown). On the other hand, the
HIV-1 p17 expression plasmid did not contain any known
splice sites, yet was not expressed in the absence of Rev.
These results further indicate that sequences other than
inefficiently used splice sites are responsible for
inhibition of gag expression.



2. Mutated p17gag vectors
To investigate the exact nature of the 
15 inhibitory element in HIV-l gag, site-directed mutagenesis 
of the pl7 Eag coding sequence with four different 
oligonucleotides, as indicated in Fig. 4, was performed. 
Each oligonucleotide introduced several point mutations 
over an area of 19-22 nucleotides. These mutations did 
not affect the amino acid sequence of the pi?*** protein, 
since they introduced silent codon changes. First, all 
four oligonucleotides were used simultaneously in 
mutagenesis using a single -stranded DNA template as 
described (T. Kunkel, Proc. Natl. Acad. Sci. USA 82:488- 
25 492 (1985); S. Schwartz et al., Mol. Cell. Biol- 12:207- 
219 (1992)). This allowed the simultaneous introduction 
of many point mutations over a large region of 270 nt in 
vector pl7. A mutant containing all four oligonucleotides 
was isolated and named pl7M1234. Compared to pl7 f this 
plasmid contained a total of 28 point mutations 
distributed primarily in regions with high AU- content. 
The phenotype of the mutant was assessed by transfections 
into HLtat cells and subsequent analysis of p!7 ga * 
expression by immunoblotting. Interestingly, p!7M1234 
produced high levels of pl7 ga * protein, higher than those 
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produced by pl7R in the presence of Rev (Fig. 2A) . This 
result demonstrated that the inhibitory/instability 
signals in pi7 sag mRNA had been inactivated in plasmid 
pl7Ml234. As expected, the presence of Rev protein did 
not increase expression from pl7M1234, but instead, had a 
slight inhibitory effect on gag expression. Thus, pl7 gag 
expression from the mutant pl7M1234 displayed the same 
general properties as the pi9« of RSV, that is, a high 
constitutive level of Rev- independent gag expression. 
Northern blot analysis revealed that the mRNA levels 
produced by pl7M1234 were increased compared to those 
produced by pl7 (Fig. 3 A) . 
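
The property relied on above (clustered point mutations that lower AU content without changing the encoded protein) can be checked mechanically. The following is a minimal sketch, not the procedure used in the experiments: the function names are illustrative, and the 18-nt test sequence is hypothetical and is not taken from HIV-1.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (DNA alphabet).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def at_fraction(dna):
    """Fraction of A and T bases (equivalent to AU content of the mRNA)."""
    dna = dna.upper()
    return sum(base in "AT" for base in dna) / len(dna)


def count_changes(original, mutant):
    """Number of positions at which two equal-length sequences differ."""
    return sum(a != b for a, b in zip(original.upper(), mutant.upper()))


def is_silent(original, mutant):
    """True if the mutant has the same length and encodes the same protein."""
    return len(original) == len(mutant) and translate(original) == translate(mutant)


# Hypothetical example (assumption, not an HIV-1 sequence): Lys-Leu-Lys-His-Ile-Val
ORIGINAL = "AAATTAAAACATATAGTA"
MUTANT = "AAGCTGAAGCACATCGTG"
```

Applied to a real coding region and its mutated counterpart, `is_silent` would confirm that the coding capacity is unchanged, while `count_changes` and `at_fraction` summarize how many nucleotides were altered and how far the AU content dropped.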

To further examine the nature and exact location
of the minimal inhibitory/instability element, the p17gag
coding sequence in plasmid p17 was mutated with only one
of the four mutated oligonucleotides at a time. This
procedure resulted in four mutant plasmids, named p17M1,
p17M2, p17M3, and p17M4, according to the oligonucleotide
that each contains. None of these mutants produced
significantly higher levels of p17gag protein compared to
plasmid p17 (Fig. 5), indicating that the
inhibitory/instability element was not affected. The p17
coding sequence was next mutated with two oligonucleotides
at a time. The resulting mutants were named p17M12,
p17M13, p17M14, p17M23, p17M24, and p17M34. Protein
production from these mutants was minimally increased
compared with that from p17, and it was considerably lower
than that from p17M1234 (Fig. 5). In addition, a triple
oligonucleotide mutant, p17M123, also failed to express
high levels of p17gag (data not shown). These findings may
suggest that multiple inhibitory/instability signals are
present in the coding sequence of p17gag. Alternatively, a
single inhibitory/instability element may span a large
region, whose inactivation requires mutagenesis with more
than two oligonucleotides. This possibility is consistent
with previous data suggesting that a 218-nucleotide
inhibitory/instability element in the p17gag coding
sequence is required for strong inhibition of gag
expression. Further deletions of this sequence resulted
in gradual loss of inhibition (S. Schwartz et al., J.
Virol. 66:150-159 (1992)). The inhibitory/instability
element may coincide with a specific secondary structure
on the mRNA. It is currently being investigated whether a
specific structure is important for the function of the
inhibitory/instability element.
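
The single, double, triple and quadruple mutants described above follow a simple combinatorial naming scheme. As an illustrative sketch only (the helper `mutant_names` is hypothetical), the possible plasmid names can be enumerated as follows. Note that the text reports only one triple mutant (p17M123) as having been tested; the other triple combinations are listed here purely for completeness.

```python
from itertools import combinations

# The four mutagenic oligonucleotides, identified by number as in the text.
OLIGOS = ("1", "2", "3", "4")


def mutant_names(prefix="p17M"):
    """Map the number of oligonucleotides used to the resulting plasmid names."""
    names = {}
    for k in range(1, len(OLIGOS) + 1):
        names[k] = [prefix + "".join(combo) for combo in combinations(OLIGOS, k)]
    return names
```

Here `mutant_names()[2]` reproduces the six double mutants named in the text, and `mutant_names()[4]` yields the single fully mutated plasmid p17M1234.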

The p17gag coding sequence has a high content of
A and U nucleotides, unlike the coding sequence of p19gag
of RSV (S. Schwartz et al., J. Virol. 66:150-159 (1992);
G. Myers and G. Pavlakis, in The Retroviridae, J. Levy,
Ed. (Plenum Press, New York, NY, 1992), pp. 1-37). Four
regions with high AU content are present in the p17gag
coding sequence and have been implicated in the inhibition
of gag expression (S. Schwartz et al., J. Virol. 66:150-
159 (1992)). Lentiviruses have a high AU content compared
to the mammalian genome. Regions of high AU content are
found in the gag/pol and env regions, while the multiply
spliced mRNAs have a lower AU content (G. Myers and G.
Pavlakis, in The Retroviridae, J. Levy, Ed. (Plenum
Press, New York, NY, 1992), pp. 1-37), supporting the
possibility that the inhibitory/instability elements are
associated with mRNA regions with high AU content. It has
been shown that a specific oligonucleotide sequence,
AUUUA, found at the AU-rich 3' untranslated regions of
some unstable mRNAs, may confer RNA instability (G. Shaw
and R. Kamen, Cell 46:659-667 (1986)). Although this
sequence is not present in the p17gag sequence, it is found
in many copies within gag/pol and env regions. The
association of instability elements with AU-rich regions
is not universal, since the RRE together with 3' HIV
sequences, which shows a strong inhibitory/instability
activity in our vectors, is not AU-rich. These
observations suggest the presence of more than one type of
inhibitory/instability sequence. In addition to reducing
the AU content, some of the mutations introduced in
plasmid p17 changed rarely used codons to more favored
codons for human cells. Although the use of rare codons
could be an alternative explanation for poor HIV gag
expression, this type of translational regulation is not
favored by these results, since the presence of Rev
corrects the defect in gag expression. In addition, the
observation that the presence of non-translated sequences
reduced gag expression (for example, the RRE sequence in
p17R) suggests that translation of the
inhibitory/instability region is not necessary for
inhibition. Introduction of RRE and 3' HIV sequences in
p17M1234 was also able to decrease gag expression,
verifying that independent negative elements not acting
co-translationally are responsible for poor expression.
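
The two sequence features discussed above, local AU content and the AUUUA pentamer, are straightforward to tabulate for any candidate region. The sketch below is one possible way to do so; the window and step sizes are arbitrary assumptions, and neither they nor the function names come from the work described here.

```python
def to_rna(seq):
    """Upper-case the sequence and express it in the RNA alphabet."""
    return seq.upper().replace("T", "U")


def au_content(seq):
    """Fraction of A and U nucleotides in the sequence."""
    rna = to_rna(seq)
    return sum(base in "AU" for base in rna) / len(rna)


def windowed_au(seq, window=30, step=10):
    """AU content in sliding windows, as (start_offset, fraction) pairs."""
    rna = to_rna(seq)
    return [
        (start, au_content(rna[start:start + window]))
        for start in range(0, len(rna) - window + 1, step)
    ]


def count_motif(seq, motif="AUUUA"):
    """Count occurrences of a motif, allowing overlapping matches."""
    rna = to_rna(seq)
    count = 0
    start = rna.find(motif)
    while start != -1:
        count += 1
        start = rna.find(motif, start + 1)
    return count
```

A profile produced by `windowed_au` only flags AU-rich stretches as candidates. As noted above, the RRE-containing region is inhibitory without being AU-rich, so such a scan cannot substitute for the expression assays.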

3. Identification and elimination of
additional INS sequences in the p24 and p15
regions of the gag gene

To examine the effect of removal of INS in the
p17gag coding region (the p17gag coding region spans
nucleotides 336-731, as described in the description of
Fig. 1(B) above, and contains the first of three parts
(i.e., p17, p24, and p15) of the gag coding region, as
indicated in Fig. 1(A) and (B)) on the expression of
the complete gag gene, expression vectors were constructed
in which additional sequences of the gag gene were
inserted 3' to the mutationally altered p17gag coding
region, downstream of the stop codon, of vector p17M1234.
Three vectors containing increasing lengths of gag
sequences were studied: p17M1234(731-1081), p17M1234(731-
1424) and p17M1234(731-2165), as shown in Fig. 1(C).
Levels of expression of p17gag were measured, with the
results indicating that the region of the mRNA encoding the
second part of the gag protein (i.e., the part encoding
the p24gag protein, which spans nucleotides 731-1424)
contains only a weak INS, as determined by a small
reduction in the amount of p17gag protein expressed by
p17M1234(731-1424) as compared with the amount of p17gag
protein expressed by p17M1234, while the region of the
mRNA encoding the third part of the gag protein (i.e., the
part encoding the p15gag protein, which spans nucleotides
1425-2165) contains a strong INS, as determined by a large
reduction in the amount of gag protein expressed by
p17M1234(731-2165) as compared with the amount of protein
expressed by p17M1234 and p17M1234(731-1424).

4. p37M1234 vector

The above analysis allowed the construction of
vector p37M1234, which expressed high levels of p37gag
precursor protein (which contains both the p17gag and p24gag
protein regions). Vector p37M1234 was constructed by
removing the stop codon at the end of the gene encoding
the altered p17gag protein and fusing the nucleotide
sequence encoding the p24gag protein into the correct
reading frame by oligonucleotide mutagenesis. This
restored the nucleotide sequence so that it encoded the
fused p17gag and p24gag protein (i.e., the p37gag protein) as
it is encoded by HIV-1. Since the presence of the p37gag
or of the p24gag protein can be quantitated easily by
commercially available ELISA kits, vector p37M1234 can be
used for inserting and testing additional fragments
suspected of containing INS. Examples of such uses are
shown below.



5. Vectors p17M1234(731-1081)NS and p55BM1234
Other vectors which were constructed in a
similar manner as was p37M1234 were p17M1234(731-1081)NS
and p55BM1234 (Fig. 1(C)). The levels of gag expression
from each of these three vectors, which allow the
translation of the region downstream (3') of the p17
coding region, were respectively similar to the levels of
gag expression from the vectors containing the nucleotide
sequences 3' to a stop codon (i.e., vectors p17M1234(731-
1081), p17M1234(731-1424) and p17M1234(731-2165),
described above). These results also demonstrate that the
INS regions in the gag gene are not affected by
translation or lack thereof through the INS region. These
results demonstrate the use of p17M1234 to detect
additional INS sequences in the HIV-1 gag coding region
(i.e., in the 1424-2165 encoding region of HIV-1 gag).
Thus, these results also demonstrate how a gene containing
one or more inhibitory/instability regions can be mutated
to eliminate one inhibitory/instability region and then
used to further locate additional inhibitory/instability
regions within that gene, if any.

6. Vectors p37M1-10D and p55M1-13P0
As described above, experiments indicated the
presence of INS in the p24 and p15 region of HIV-1 in
addition to those identified and eliminated in the p17gag
region of HIV-1. This is depicted schematically in Figure
6 on page 7180 of Schwartz et al., J. Virol. 66:7176-7182
(1992). In that figure, cgagM1234 is identical to
p55BM1234.

By studying the expression of p24gag protein in
vectors encoding the p24gag protein containing additional
gag and pol sequences, it was found that vectors that
contained the complete gag gene and part of the pol gene
(e.g. vector p55BM1234, see Fig. 6) were not expressed at
high levels, despite the elimination of INS-1 in the p17gag
region as described above. The inventors have
hypothesized that this is caused by the presence of
multiple INS regions able to act independently of each
other. To eliminate the additional INS, several mutant
HIV-1 oligonucleotides were constructed (see Table 2) and
incorporated in various gag expression vectors. For
example, oligonucleotides M6gag, M7gag, M8gag and M10gag
were introduced into p37M1234, resulting in p37M1-10D, and
the same oligonucleotides were introduced into p55BM1234,
resulting in p55BM1-10. These experiments revealed a
dramatic improvement of expression of p37gag (which is the
p17gag and p24gag precursor) and p55gag (which is the intact
gag precursor molecule produced by HIV-1) upon the
incorporation in the expression vectors p37M1234 and
p55BM1234 of additional mutations contained in the
oligonucleotides M6gag, M7gag, M8gag and M10gag (described
in Table 2). Fig. 6 shows that expression was
dramatically improved after the introduction of additional
mutations.

Of particular interest was p37M1-10D, which
produced very high levels of gag. This has been the
highest producing gag construct (see Fig. 6).
Interestingly, addition of gag and pol sequences as in
vectors p55BM1-10 and p55AM1-10 (Fig. 6) reduced the
levels of gag expression. Upon further mutagenesis, the
inhibitory effects of this region were partially
eliminated as shown in Fig. 6 for vector p55M1-13P0.
Introduction of mutations defined by the gag region
nucleotides M10gag, M11gag, M12gag, M13gag, and pol region
nucleotide M0pol increased the levels of gag expression
approximately six fold over vectors such as p55BM1-10.

The HIV-1 promoter was replaced by the human
cytomegalovirus early promoter (CMV) in plasmids p37M1-10D
and p55M1-13P0 to generate plasmids pCMV37M1-10D and
pCMV55M1-13P0, respectively. For this, a fragment
containing the CMV promoter was amplified by PCR
(nucleotides -670 to +73, where +1 is the start of
transcription; see Boshart et al., Cell 41:521
(1985)). This fragment was exchanged with the StuI-
BssHII fragment in gag vectors p37M1-10D and p55M1-13P0,
resulting in the replacement of the HIV-1 promoter with
that of CMV. The resulting plasmids were compared to
those containing the HIV-1 promoter after transfection in
human cells, and gave similar high expression of gag.
Therefore, the high expression of gag can be achieved in
the total absence of any other viral protein. The
exchange of the HIV-1 promoter with other promoters is
beneficial if constitutive expression is desirable and also
for expression in other mammalian cells, such as mouse
cells, in which the HIV-1 promoter is weak.

The constructed vectors p37M1-10D and p55BM1-10
can be used for the Rev-independent production of p37gag
and p55gag proteins, respectively. In addition, these
vectors can be used as convenient reporters, to identify
and eliminate additional INS in different RNA molecules.

Using the protocols described herein, regions
have been identified within the gp41 (the transmembrane
part of HIV-1 env) coding area and at the post-env 3'
region of HIV-1 which contain INS. The elimination of INS
from gag, pol and env regions will allow the expression of
high levels of authentic HIV-1 structural proteins in the
absence of the Rev regulatory factor of HIV-1. The
mutated coding sequences can be incorporated into
appropriate gene transfer vectors which may allow the
targeting of specific cells and/or more efficient gene
transfer. Alternatively, the mutated coding sequences can
be used for direct expression in human or other cells in
vitro or in vivo with the goal being the production of
high protein levels and the generation of a strong immune
response. The ultimate goal in either case is subsequent
protection from HIV infection and disease.

The described experiments demonstrate that the
inhibitory/instability sequences are required to prevent
HIV-1 expression. This block to the expression of viral
structural proteins can be overcome by the Rev-RRE
interaction. In the absence of INS, HIV-1 expression
would be similar to simpler retroviruses and would not
require Rev. Thus, the INS is a necessary component of
Rev regulation. Sequence comparisons suggest that the INS
element identified here is conserved in all HIV-1
isolates, although this has not been verified
experimentally. The majority (22 of 28) of the mutated
nucleotides in gag are conserved in all HIV-1 isolates,
while 22 of 28 are conserved also in HIV-2 (G. Myers et
al., Eds., Human retroviruses and AIDS. A compilation and
analysis of nucleic acid and amino acid sequences (Los
Alamos National Laboratory, Los Alamos, New Mexico, 1991),
incorporated herein by reference). Several lines of
evidence indicate that all lentiviruses and other complex
retroviruses such as the HTLV group contain similar INS
regulatory elements. Strong INS elements have been
identified in the gag region of HTLV-I and SIV (manuscript
in preparation). This suggests that INS are important
regulatory elements, and may be responsible for some of
the biological characteristics of the complex
retroviruses. The presence of INS in SIV and HTLV-I
suggests that these elements are conserved among complex
retroviruses. Since INS inhibit expression, it must be
concluded that their presence is advantageous to the
virus, otherwise they would be rapidly eliminated by
mutations.

The observations that the inhibitory/instability
sequences act in the absence of any other viral proteins
and that they can be inactivated by mutagenesis suggest
that these elements may be targets for the binding of
cellular factors that interact with the mRNA and inhibit
posttranscriptional steps of gene expression. The
interaction of HIV-1 mRNAs with such factors may cause
nuclear retention, resulting in either further splicing or
rapid degradation of the mRNAs. It has been proposed that
components of the splicing machinery interact with splice
sites in HIV-1 mRNAs and modulate mRNA expression (A.
Cochrane et al., J. Virol. 65:5305-5313 (1991); D. Chang
and P. Sharp, Cell 59:789-795 (1989); X. Lu et al., Proc.
Natl. Acad. Sci. USA 87:7598-7602 (1990)). However, it is
not likely that the inhibitory/instability elements
described here are functional 5' or 3' splice sites.
Thorough mapping of HIV-1 splice sites performed by
several laboratories using the Reverse Transcriptase-PCR
technique failed to detect any splice sites within gag (S.
Schwartz et al., J. Virol. 64:2519-2529 (1990); J.
Guatelli et al., J. Virol. 64:4093-4098 (1990); E. D.
Garrett et al., J. Virol. 65:1653-1657 (1991); M. Robert-
Guroff et al., J. Virol. 64:3391-3398 (1990); S. Schwartz
et al., J. Virol. 64:5448-5456 (1990); S. Schwartz et
al., Virology 183:677-686 (1991)). The suggestions that
Rev may act by dissociating unspliced mRNA from the
spliceosomes (D. Chang and P. Sharp, Cell 59:789-795
(1989)) or by inhibiting splicing (J. Kjems et al., Cell
67:169-178 (1991)) are not easily reconciled with the
knowledge that all retroviruses produce structural
proteins from mRNAs that contain unutilized splice sites.
Splicing of all retroviral mRNAs, including HIV-1 mRNAs in
the absence of Rev, is inefficient compared to splicing of
cellular mRNAs (J. Kjems et al., Cell 67:169-178 (1991);
A. Krainer et al., Genes Dev. 4:1158-1171 (1990); R. Katz
and A. Skalka, Mol. Cell. Biol. 10:696-704 (1990); C.
Stoltzfus and S. Fogarty, J. Virol. 63:1669-1676 (1989)).
The majority of the retroviruses do not produce Rev-like
proteins, yet they efficiently express proteins from
partially spliced mRNAs, suggesting that inhibition of
expression by unutilized splice sites is not a general
property of retroviruses. Experiments using constructs
expressing mutated HIV-1 gag and env mRNAs lacking
functional splice sites showed that only low levels of
these mRNAs accumulated in the absence of Rev and that
their expression was Rev-dependent (M. Emerman et al.,
Cell 57:1155-1165 (1989); B. Felber et al., Proc. Natl.
Acad. Sci. USA 86:1495-1499 (1989); M. Malim et al.,
Nature (London) 338:254-257 (1989)). This led to the
conclusion that Rev acts independently of splicing (B.
Felber et al., Proc. Natl. Acad. Sci. USA 86:1495-1499
(1989); M. Malim et al., Nature (London) 338:254-257
(1989)) and to the proposal that inhibitory/instability
elements other than splice sites are present on HIV-1
mRNAs (C. Rosen et al., Proc. Natl. Acad. Sci. USA
85:2071-2075 (1988); M. Hadzopoulou-Cladaras et al., J.
Virol. 63:1265-1274 (1989); B. Felber et al., Proc. Natl.
Acad. Sci. USA 86:1495-1499 (1989)).

Construction of the Gag Expression Plasmids

Plasmid p17R has been described as pNL17R (S.
Schwartz et al., J. Virol. 66:150-159 (1992)). Plasmid
p17 was generated from p17R by digestion with restriction
enzyme Asp718 followed by religation. This procedure
deleted the RRE and HIV-1 sequences spanning nt 8021-8561
upstream of the 3' LTR. To generate mutants of p17gag, the
p17gag coding sequence was subcloned into a modified
pBLUESCRIPT vector (Stratagene) and single-stranded
uracil-containing DNA was generated. Site-directed
mutagenesis was performed as described (T. Kunkel, Proc.
Natl. Acad. Sci. USA 82:488-492 (1985); S. Schwartz et
al., Mol. Cell. Biol. 12:207-219 (1992)). Clones containing
the appropriate mutations were selected by sequencing of
double-stranded DNA. To generate plasmid p19R, plasmid
p17R was first digested with BssHII and EcoRI, thereby
deleting the entire p17gag coding sequence, six nucleotides
upstream of the p17gag AUG and nine nucleotides of linker
sequences 3' of the p17gag stop codon. The p17gag coding
sequence in p17R was replaced by a PCR-amplified DNA
fragment containing the RSV p19gag coding sequence (R.
Weiss et al., RNA Tumor Viruses. Molecular Biology of
Tumor Viruses (Cold Spring Harbor Laboratory, Cold Spring
Harbor, New York, 1985)). This fragment contained eight
nucleotides upstream of the RSV gag AUG and the p19gag
coding sequence immediately followed by a translational
stop codon. The RSV gag fragment was derived from the
infectious RSV proviral clone S-RA (R. Weiss et al., RNA
Tumor Viruses. Molecular Biology of Tumor Viruses (Cold
Spring Harbor Laboratory, Cold Spring Harbor, New York,
1985)). p19 was derived from p19R by excising an Asp718
fragment containing the RRE and 3' HIV-1 sequences
spanning nt 8021-8561.

Transfection of HLtat Cells With Gag Expression Plasmids
HLtat cells (S. Schwartz et al., J. Virol.
64:2519-2529 (1990)) were transfected using the calcium
coprecipitation technique (F. Graham and A. Van der
Eb, Virology 52:456-460 (1973)) as described (B. Felber et
al., Proc. Natl. Acad. Sci. USA 86:1495-1499 (1989)),
using 5 µg of p17, p17R, p17M1234, p19, or p19R in the
absence (-) or presence (+) of 2 µg of the Rev-expressing
plasmid pL3crev (B. Felber et al., Proc. Natl. Acad. Sci.
USA 86:1495-1499 (1989)). The total amount of DNA in
transfections was adjusted to 17 µg per 0.5 ml of
precipitate per 60 mm plate using pUC19 carrier DNA.
Cells were harvested 20 h after transfection and cell
extracts were subjected to electrophoresis on 12.5%
denaturing polyacrylamide gels and analyzed by
immunoblotting using either human HIV-1 patient serum
(Scripps) or a rabbit anti-p19gag serum. pRSV-luciferase
(J. de Wet et al., Mol. Cell. Biol. 7:725-737 (1987)), which
contains the firefly luciferase gene linked to the RSV LTR
promoter, was used as an internal standard to control for
transfection efficiency and was quantitated as described
(L. Solomin et al., J. Virol. 64:6010-6017 (1990)). The
results are set forth in Fig. 2.
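
The carrier DNA needed to bring each transfection to the stated total follows from simple subtraction. The sketch below is illustrative only: the function name is hypothetical, and because the amount of pRSV-luciferase standard is not stated in the text, it is represented by an `other_ug` parameter that defaults to zero (an assumption).

```python
def carrier_dna_ug(total_ug=17.0, test_plasmid_ug=5.0,
                   rev_plasmid_ug=0.0, other_ug=0.0):
    """pUC19 carrier DNA (in micrograms) needed to reach the total DNA amount."""
    carrier = total_ug - test_plasmid_ug - rev_plasmid_ug - other_ug
    if carrier < 0:
        raise ValueError("plasmid amounts exceed the total DNA per plate")
    return carrier
```

With the defaults (5 µg of gag plasmid, no Rev plasmid) the function returns 12 µg of carrier; passing `rev_plasmid_ug=2.0` gives 10 µg. Any luciferase standard included would reduce these figures accordingly.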

Northern Blot Analysis
HLtat cells were transfected as described above
and harvested 20 h post transfection. Total RNA was
prepared by the heparin/DNase method (Z. Krawczyk and C.
Wu, Anal. Biochem. 165:20-27 (1987)), and 20 µg of total
RNA was subjected to northern blot analysis as described
(M. Hadzopoulou-Cladaras et al., J. Virol. 63:1265-1274
(1989)). The filters were hybridized to a nick-translated
PCR-amplified DNA fragment spanning nt 8304-9008 in the
HIV-1 3' LTR. The results are set forth in Fig. 3.

EXAMPLE 3 
HIV-l ENV GENE 
Fragments of the env gene were inserted into
vectors p19 or p37M1234 and the expression of the
resulting plasmids was analyzed by transfections into
HLtat cells. It was found that several fragments
inhibited protein expression. One of the strong INS
identified was in the fragment containing nucleotides
8206-8561 ("fragment [8206-8561]"). To eliminate this
INS, the following oligonucleotides were synthesized and
used in mutagenesis experiments as specified supra. The
fragment was derived from the molecular clone pNL43, which
is almost identical to HXB2. The numbering system used
herein follows the numbering of molecular clone HXB2
throughout. The synthesized oligonucleotides follow the
pNL43 sequence.

The oligonucleotides which were used to
mutagenize fragment [8206-8561], and which made changes in
the env coding region between nucleotides 8210-8555 (the
letters in lower case indicate mutated nucleotides) were:

#1: 8194-8261
GAATAGTGCTGTTAACcTcCTgAAcGCtACcGCtATcGCcGTgGCgGAaGGaACcGAcAGGGTTATAG
(SEQ ID NO: 10)

#2: 8262-8323
AAGTATTACAAGCcGCcTAccGcGGcATcaGaCAtATcCCccGccGcATccGcCAGGGCTTG
(SEQ ID NO: 11)

#3: 8335-8392
GCTATAAGATGGGcGGtAAaTGGagcAAgtcctccGTcATcGGcTGGCCTGCTGTAAG
(SEQ ID NO: 12)

#4: 8393-8450
GGAAAGAATGcGcaGgGCcGAaCCcGCcGCcGAcGGaGTtGGcGCcGTATCTCGAGAC
(SEQ ID NO: 13)

#5: 8451-8512
CTAGAAAAACAcGGcGCCATtACCtcCtCtAAcACcGCcGCCAAtAAcGCcGCTTGTGCCTG
(SEQ ID NO: 14)

#6: 8513-8572
GCTAGAAGCACAgGAaGAaGAgGAaGTcGGcTTcCCcGTtACcCCTCAGGTACCTTTAAG
(SEQ ID NO: 15)

The expression of env was increased by the
elimination of the INS in fragment [8206-8561] as
determined by analysis of both mRNA and protein.
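
The oligonucleotide listing above can be cross-checked against its stated coordinates. The sketch below assumes that the first base of each oligonucleotide corresponds to the first coordinate given for it; this is an inference from the listing and is not stated explicitly in the text. The helper names are illustrative.

```python
# Oligonucleotides #1 and #6 as listed above (lower case = mutated nucleotide).
OLIGO_1 = ("GAATAGTGCTGTTAACcTcCTgAAcGCtACcGCtATcGCcGTgGCgGAaGGaACcGAc"
           "AGGGTTATAG")
OLIGO_6 = ("GCTAGAAGCACAgGAaGAaGAgGAaGTcGGcTTcCCcGTtACcCCTCAGGTACCTTTA"
           "AG")


def span_length(start, end):
    """Number of nucleotides in an inclusive coordinate range."""
    return end - start + 1


def mutated_positions(oligo, start):
    """Coordinates of lower-case (mutated) nucleotides.

    Assumes the first base of the oligonucleotide lies at coordinate `start`.
    """
    return [start + offset for offset, base in enumerate(oligo) if base.islower()]
```

Under that assumption, the first mutated nucleotide of oligonucleotide #1 falls at 8210 and the last mutated nucleotide of oligonucleotide #6 at 8555, which agrees with the stated range of changes (nucleotides 8210-8555).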

To further characterize in detail the INS in
HIV-1 env, the coding region of env was divided into
different fragments, which were produced by PCR using
appropriate synthetic oligonucleotides, and cloned in
vector p37M1-10D. This vector was produced from p37M1234
by additional mutagenesis as described above. After
introduction into human cells, vector p37M1-10D produces
high levels of p37gag protein. Any strong INS element will
inhibit the expression of gag if ligated in the same
vector. The summary of the env fragments used is shown in
Figure 11. The results of these experiments show that,
as in HIV-1 gag, there exist multiple regions inhibiting
expression in HIV-1 env, and combinations of such regions
result in additive or synergistic inhibition. For
example, while fragments 1, 2, or 3 individually inhibit
expression by 2-6 fold, the combination of these fragments
inhibits expression by 30 fold. Based on these results,
additional mutant oligonucleotides have been synthesized
for the correction of env INS. These oligonucleotides
have been introduced in the expression vectors for HIV-1
env p120pA and p120R270 (see Fig. 7) for the development
of Rev-independent HIV-1 env expression plasmids as
discussed in detail below.

1. The mRNAs for gp160 and for the
extracellular domain (gp120) are defective
and their expression depends on the
presence of RRE in cis and Rev in trans

1.1 Positive and Negative Determinants for
env mRNA Expression of HIV


Previous experiments on the identification and
characterization of the env expressing cDNAs had
demonstrated that Env is produced from mRNAs that contain
exon 4AE, 4BE, or 5E (Schwartz et al., J. Virol.
64:5448-5456 (1990); Schwartz et al., Mol. Cell. Biol.
12:207-219 (1992)). All constructs generated to study the
determinants of env expression are derived from pNL15E.
This plasmid contains the HIV-1 LTR promoter, the complete
env cDNA 15E, and the HIV 3' LTR including the
polyadenylation signal (Schwartz et al., J. Virol.
64:5448-5456 (1990)) (Fig. 7). pNL15E was generated from
the molecular clone pNL4-3 (pNL4-3 is identical to pNL43
herein) (Adachi et al., J. Virol. 59:284-291 (1986)) and
lacks the splice acceptor site for exon 6D, which was used
to generate the tev mRNA (Benko et al., J. Virol. 64:2505-
2518 (1990)). The Env expression plasmids were transfected
in the presence or absence of the Rev-expressing plasmid
pL3crev (Felber et al., J. Virol. 64:3734-3741 (1990)) into
HLtat cells (Schwartz et al., J. Virol. 64:2519-2529
(1990)), which constitutively express Tat (one-exon Tat).
One day later, the cells were harvested for analyses of
RNA and protein. Total RNA was extracted and analyzed on
Northern blots. Protein production was measured by
Western blots to detect cell-associated Env. In the
absence of Rev, NL15E mRNA was efficiently spliced and
produced Nef; in the presence of Rev, most of the RNA
remained unspliced and produced the Env precursor gp160,
which is processed to gp120, the secreted portion of the
precursor, and gp41.

To allow for the effects of INS to be 
distinguished and studied separately from splicing, splice 
sites known to exist within some of the fragments used 
were eliminated as discussed below. Analysis of the 
resulting expression vectors included size determination 
of the produced mRNA, providing the verification that 
splicing does not interfere with the interpretation of the 
data. 

1.2 Env expression is Rev-dependent also
in the absence of functional splice
sites

To study the effect of splicing on env
expression, the splice donor at nt 5592 was removed by
site-directed mutagenesis (changing GCAGTA to GaAtTc, and
thus introducing an EcoRI site), which resulted in plasmid
15ESD- (Fig. 7). The mRNA from this construct was
efficiently spliced and produced a small mRNA encoding Nef
(Fig. 8). Sequence analysis revealed that this spliced
mRNA was generated by the use of an alternative splice
donor located at nt 5605 (TACATgtaatg) and the common
splice acceptor site at nt 7925. In contrast to published
work (Lu et al., Proc. Natl. Acad. Sci. USA 87:7598-7602
(1990)), expression of Env from this mutant depended on
Rev. Next, the splice acceptor site at nt 7925 was
mutated. Since previous cDNA cloning had revealed that in
addition to the splice acceptor site at nt 7925 there are
two additional splice acceptor sites at nt 7897 and nt
7901 (Schwartz et al., J. Virol. 64:2519-2529 (1990)), this
region of 43 bp encompassing nt 7884 to nt 7926 was
removed. This resulted in p15EDSS (Fig. 7). Northern
blot analysis of mRNA from HLtat cells transfected with
this construct confirmed that the 15EDSS mRNA is not
spliced (Fig. 8B). Although all functional splice sites
have been removed from p15EDSS, Rev is still required for
Env production (Fig. 8A). Taken together with data
obtained by studying gag expression, these results suggest
that the presence of inefficiently used splice sites is
not the primary determinant for Rev-dependent Env
expression. It is known that at least two unused splice
sites are present in this mRNA (the alternative splice
donor at nt 5605 and the splice donor of exon 6D at nt
6269). Therefore, it cannot be ruled out that initial
spliceosome formation can occur, which does not lead to
the execution of splicing. It is possible that this is
sufficient to retain the mRNA in the nucleus and, since no
splicing occurs, that this would lead to degradation of
the mRNA. Alternatively, it is possible that
splice-site-independent RNA elements similar to those
identified within the gag/pol region (INS) are responsible
for the Rev dependency (Schwartz et al., J. Virol.
66:7176-7182 (1992); Schwartz et al., J. Virol. 66:150-
159 (1992)).
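
The two sequence manipulations described above (the splice donor mutation that introduces an EcoRI site, and the 43-bp deletion of the splice acceptor region) can be verified with a few lines of code. This is a minimal sketch with illustrative helper names; the EcoRI recognition sequence GAATTC is standard, and all other values are taken from the text.

```python
ECORI_SITE = "GAATTC"   # EcoRI recognition sequence

DONOR_ORIGINAL = "GCAGTA"   # wild-type sequence at the nt 5592 splice donor
DONOR_MUTATED = "GaAtTc"    # lower case = mutated nucleotide, as in the text


def has_gt_dinucleotide(seq):
    """True if the sequence still contains a GT dinucleotide."""
    return "GT" in seq.upper()


def creates_site(seq, site=ECORI_SITE):
    """True if the sequence contains the given restriction site."""
    return site in seq.upper()


def changed_positions(original, mutated):
    """1-based positions at which the two sequences differ."""
    pairs = zip(original.upper(), mutated.upper())
    return [i for i, (a, b) in enumerate(pairs, start=1) if a != b]


def deletion_length(first_nt, last_nt):
    """Length of a deletion covering an inclusive nucleotide range."""
    return last_nt - first_nt + 1
```

The three changed positions match the three lower-case letters in GaAtTc, the GT dinucleotide of the donor is lost, and `deletion_length(7884, 7926)` reproduces the stated 43 bp.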

1.3 Identification of negative elements
within gp120 mRNA
To distinguish between these possibilities, a
series of constructs were designed that allowed the
determination of the location of such INS elements.
First, a stop codon followed by the restriction sites for
NruI and MluI was introduced at the cleavage site between
the extracellular gp120 and the transmembrane protein gp41
at nt 7301 in plasmid NL15EDSS, resulting in p120DSS (Fig.
7). Immunoprecipitation of gp120 from the medium of cells
transfected with p120DSS confirmed the production of high
levels of gp120 only in the presence of Rev (Fig. 9B).
The release of gp120 is very efficient, since only barely
detectable amounts remain associated with the cells (data
not shown). This finding rules out the possibility that
the translation of the gp41 portion of the env cDNA is
responsible for the defect in env expression. Next, the
region 3' of the stop codon of gp120 (consisting of gp41,
including the RRE and 3' LTR) was replaced with the SV40
polyadenylation signal (Fig. 7). This
construct, p120pA, produced very low levels of gp120 in
the absence of Rev (Fig. 9B). Background levels of Env
were produced from p120DR (Fig. 7), which was generated
from pBS120DSS by removing the 5' portion of gp41
including the RRE (MluI to HpaI at nt 8200) (Fig. 9B).
These results demonstrate the presence of a major INS-like
sequence within the gp120 portion. To study the effect of
Rev on this mRNA, different RREs (RRE330, RRE270, and
RRED345 (Solomin et al., J. Virol. 64:6010-6017 (1990)))
were inserted into p120pA downstream of the gp120 stop
codon, resulting in p120R330, p120R270, and p120RD345,
respectively (Fig. 7). Immunoprecipitations demonstrated
that the presence of Rev in trans and the RRE in cis could
rescue the defect in the gp120 expression plasmid. High
levels of gp120 were produced from p120R330 (data not
shown), p120R270, and p120RD345 (Fig. 9B) in the presence
of Rev.

Northern blot analysis (Fig. 8A) confirmed the
protein data. The presence of Rev resulted in the
accumulation of high levels of mRNA produced by pBS120DSS,
p120R270, and p120RD345. Low but detectable levels of RNA
were produced from p120DpA and p120DR.
2. Identification of INS elements located
within the env mRNA regions using two
strategies

To identify elements that have a downregulatory
effect in vivo, fragments of env cDNA were inserted into
two different test expression vectors, p19 and p37M1-10D.
These vectors contain a strong promoter for rapid
detection of the gene product, such as the HIV-1 LTR in
the presence of Tat, and an indicator gene that is
expressed at high levels and can easily be assayed, such as
p19gag of RSV or the mutated p37gag gene of HIV-1
(p37M1-10D), neither of which contains any known INS-like
elements. Expression vector p19 contains the HIV-1 LTR
promoter, the RSV p19gag matrix gene, and HIV-1 sequences
starting at KpnI (nt 8561) including the complete 3' LTR
(Schwartz et al., J. Virol. 66:7176-7182 (1992)). Upon
transfection into HLtat cells, high levels of p19gag are
constitutively produced and are visualized on Western
blots. Expression vector p37M1-10D contains the HIV-1 LTR
promoter, the mutant p37gag (M1-10), and the 3' portion of
the virus starting at KpnI (nt 8561). Upon transfection
into HLtat cells, this plasmid constitutively produces
p37gag that can be quantitated by the HIV-1 p24gag antigen
capture assay.

2.1 Identification of INS elements using
the RSV gag expression vector

INS elements within the gp41 and gp120 portions
were identified. To this end, the vector p19 was used and
the following fragments (Fig. 10) were inserted: (A) nt
7684 to 7959; (B) nt 7684 to 7884 and nt 7927 to 7959;
this is similar to fragment A but has the region of the
splice acceptors 7A, 7B and 7 deleted; (C) nt 7595 to 7884
and nt 7927 to 7959, having the splice sites deleted as in
B; (D) nt 7939 to 8066; (E) nt 7939 to 8416; (F) nt 8200
to 8561 (HpaI-KpnI); (G) nt 7266 to 7595 containing the
intact RRE; (H) nt 5523 to 6190, having the splice donor
SD5 deleted.

Fragments A, B, and D did not affect Gag
expression, whereas fragment G (RRE) decreased Gag
expression approximately 5-fold. Fragments C, E, and H
lowered Gag expression by about 10- to 20-fold, indicating
the presence of INS elements.

Interestingly, it was observed that the
insertion of element F, spanning 350 bp, in plasmid p19
abolished production of Gag, indicating the presence of a
strong INS within this element. The presence of the RRE
in cis and Rev in trans resulted in production of high
levels of RSV p19gag. Fragment F also had a smaller
downregulatory effect on the expression of the
INS-corrected p17gag of HIV-1 (p17M1234). These
experiments revealed the presence of multiple elements
located within the env mRNA that cause inhibition of p19gag
expression.

2.2 Elimination of the INS within
fragment F

Six synthetic oligonucleotides (Table 3) were
generated that introduced 103 point mutations within this
region of 330 nt without affecting the amino acid
composition of Env. The mutated fragment F was tested in
p19 to verify that the INS elements are destroyed. The
introduction of the mutations within oligo #1 only
marginally affected the expression of p19gag, whereas the
presence of all oligos (#1 to #6) completely inactivated
the INS effect of fragment F. This is another example
showing that more than one region within an INS element
needs to be mutagenized to eliminate the INS effect.
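
The requirement that clustered substitutions leave the encoded protein unchanged can be checked mechanically. The following is a minimal sketch and not part of the disclosed method: it translates a wild-type and a mutant oligonucleotide with the standard genetic code and compares the results. The function names are hypothetical, the M7gag pair from Table 2 is used only as sample input, and the reading-frame offset of 2 is an assumption inferred here from the three-nucleotide spacing of the substitutions rather than stated in the text.

```python
from itertools import product

# Standard genetic code; codons are enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq, offset=0):
    """Translate complete codons of a DNA sequence starting at `offset`."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(offset, len(seq) - 2, 3)
    )


def is_silent(wild, mutant, offset=0):
    """True if both sequences have equal length and encode the same peptide."""
    return len(wild) == len(mutant) and (
        translate(wild, offset) == translate(mutant, offset)
    )


# Sample input: M7gag (1105-1139) as listed in Table 2.
# Lower-case letters mark mutated nucleotides; frame offset 2 is assumed.
M7GAG_WILD = "CAGTAGGAGAAATTTATAAAAGATGGATAATCCTG"
M7GAG_MUTANT = "CAGTAGGAGAgATcTAcAAGAGgTGGATAATCCTG"

if __name__ == "__main__":
    print(translate(M7GAG_WILD, 2))
    print(is_silent(M7GAG_WILD, M7GAG_MUTANT, 2))
```

A check of this kind only confirms that the coding capacity is preserved; whether the INS effect is actually eliminated still has to be determined by expression assays as described above.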

It is noteworthy that this INS element is
present in all the multiply spliced Rev-independent mRNAs,
such as tat, rev and nef. Experiments were performed to
define the function of fragment F within the class of the
small mRNAs by removing this fragment from the tat cDNA.
In the context of this mRNA, this element confers only a
weak INS effect (3- to 5-fold inhibition), which suggests
that inhibition of expression in env mRNA may require the
presence of at least two distinct elements. These results
suggest that the INS effect within env is based on
multiple interacting components. Alternatively, the
relative location and interactions among multiple INS
components may be important for the magnitude of the INS
effect. Therefore, more than one type of analysis in
different vectors may be necessary for the identification
and elimination of INS.

2.3 Identification of INS elements using
the p37M1-10D expression vector

The env coding region was subdivided into
different consecutive fragments. These fragments and
combinations thereof were PCR-amplified using oligos as
indicated in Fig. 11 and inserted downstream of the
mutated p37gag gene in p37M1-10D. The plasmids were
transfected into HLtat cells that were harvested the next
day and analyzed for p24gag expression. Fig. 11 shows that
the presence of fragments 2, 3, 5 as well as the
combination 1+2+3 lowered gag expression substantially.
Different oligos (Table 4) were synthesized that change
the AT-rich domains, including the three AATAAA elements
located within the env coding region, by changing the
nucleotide but not the amino acid composition of Env. In
a first approach, these oligos 1-19 are being introduced
into plasmid p120R270 with the goal of producing gp120 in
a Rev-independent manner. Oligonucleotides such as oligos
20-26 will then be introduced into the gp41 portion, the
two env portions combined and the complete gp160 expressed
in a Rev-independent manner.
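
The search for AT-rich domains and AATAAA elements described above lends itself to a simple sequence scan. The sketch below is illustrative only: the function names are hypothetical, and the M17 wild-type and mutant sequences from Table 4 are used purely as sample input, exactly as they appear in the table.

```python
def find_motif(seq, motif="AATAAA"):
    """Return 0-based start positions of every occurrence of `motif`,
    including overlapping occurrences. Case is ignored."""
    seq, motif = seq.upper(), motif.upper()
    return [
        i for i in range(len(seq) - len(motif) + 1)
        if seq.startswith(motif, i)
    ]


def at_fraction(seq):
    """Fraction of A and T nucleotides in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


# Sample input: M17 (nt 7006-7048) as listed in Table 4.
M17_WILD = "CACAATCACCCTCCCATGCAGAATAAAACAAATTATAAACATG"
M17_MUTANT = "CACAATCACcCTgCCcTGCcGcATcAAgCAgATcATAAACATG"

if __name__ == "__main__":
    for name, seq in (("wild-type", M17_WILD), ("mutant", M17_MUTANT)):
        print(name, find_motif(seq), round(at_fraction(seq), 2))
```

For this pair the scan reports one AATAAA element in the wild-type sequence and none in the mutant, together with a lower A+T content in the mutant.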
EXAMPLE 3
PROTO-ONCOGENE C-FOS
Fragments of the fos gene were inserted into the
vector p19 and the expression of the resulting plasmids
was analyzed by transfection into HLtat cells. It was
found that several fragments inhibited protein expression.
A strong INS was identified in the fragment containing
nucleotides 3328-3450 ("fragment [3328-3450]")
(nucleotides of the fos gene are numbered according to
GenBank sequence entry HUMCFOT, ACCESSION # V01512). In
addition, a weaker element was identified in the coding
region.

To eliminate these INS the following 
oligonucleotides were synthesized and are used in 
mutagenesis experiments as specified supra. 

To eliminate the INS in the fos non-coding
region, the following oligonucleotides, which make changes
in the fos non-coding region between nucleotides
[3328-3450] (the letters in lower case indicate mutated
nucleotides), were synthesized and are used to mutagenize
fragment [3328-3450] in mutagenesis experiments as
specified supra:



#1:
3349-3391
TGAAAACGTTcgcaTGTGTcgcTAcgTTgcTTAcTAAGATGGA (SEQ ID NO: 16)

#2:
3392-3434
TTCTCAGATAccTAgcTTcaTATTgccTTaTTgTCTACCTTGA (SEQ ID NO: 17)

These oligonucleotides are used to mutagenize
fos fragment [3328-3450] inserted into vectors p19,
p17M1234 or p37M1234, and the expression of the resulting
plasmids is analyzed after transfection into HLtat cells.

The expression of fos is expected to be
increased by the elimination of this INS region.

To further define and eliminate the INS elements
in the coding region, additional longer fragments of fos
are introduced into vector p37M1234. The INS element in
the coding region is first mapped more precisely using
this expression vector and is then corrected using the
following oligonucleotides:

#1
2721-2770
GCCCTGTGAGtaGGCActGAAGGacAGcCAtaCGtaACatACAAGTGCCA (SEQ ID NO: 18)

#2
2670-2720
AGCAGCAGCAATGAaCCTagtagcGAtagcCTgAGtagcCCtACGCTGCTG (SEQ ID NO: 19)

#3
2620-2669
ACCCCGAGGCaGAtagCTTtCCatccTGcGCtGCcGCtCACCGCAAGGGC (SEQ ID NO: 20)

#4
2502-2562
CTGCACAGTGGaagCCTcGGaATGGGcCCtATGGCtACcGAatTGGAaCCaCTGTGCACTC (SEQ ID NO: 21)

The expression of fos is expected to be
increased by the elimination of this INS region.
EXAMPLE 4
HIV-1 POL GENE
Vector p37M1234 was used to eliminate an
inhibitory/instability sequence from the pol gene of HIV-1
which had been characterized by A.W. Cochrane et al.,
"Identification and characterization of intragenic
sequences which repress human immunodeficiency virus
structural gene expression", J. Virol. 65:5305-5313
(1991). These investigators suggested that a region in
pol (HIV nucleotides 3792-4052), termed CRS, was important
for inhibition. A larger fragment spanning this region,
which contained nucleotides 3700-4194, was inserted into
the vector p37M1234 and its effect on the expression of
p37gag from the resulting plasmid (plasmid p37M1234RCRS)
(see Fig. 12) was analyzed after transfection into HLtat
cells.

Severe inhibition of gag expression (10-fold,
see Fig. 13) was observed.

In an effort to eliminate this INS, the
following oligonucleotides were synthesized (the letters
in lower case indicate mutated nucleotides) and used in
mutagenesis experiments.

First, it was observed that one AUUUA potential
instability element was within the INS region. This was
eliminated by mutagenesis using oligonucleotide M10pol,
resulting in plasmid p37M1234RCRSP10. The expression of
gag from this plasmid was not improved, demonstrating that
elimination of the AUUUA element alone did not eliminate
the INS. See Fig. 12. Therefore, additional mutagenesis
was performed and it was shown that a combination of
mutations introduced in plasmid p37M1234RCRS was necessary
and sufficient to produce high levels of gag proteins,
similar to those produced by the plasmid lacking CRS. The
mutations necessary for the elimination of the INS are
shown in Fig. 13.

The above results demonstrate that HIV-1 pol
contains INS elements that can be detected and eliminated
with the techniques described.

These results also suggest that regions outside
of the minimal inhibitory region in CRS as defined by A.W.
Cochrane et al., supra, influence the levels of
expression. These results suggest that the RNA structure
of the region is important for the inhibition of
expression.
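
The relationship between the inserted pol fragment (nt 3700-4194), the CRS region reported by Cochrane et al. (nt 3792-4052), and the pol oligonucleotides of Table 2 can be tabulated from their coordinates alone. The sketch below is a bookkeeping aid under stated assumptions: the helper names are hypothetical, coordinates are treated as inclusive, and only the Table 2 oligonucleotides positioned near the fragment are listed. It says nothing about which mutations were functionally required.

```python
# Coordinates as given in Example 4 (inclusive nucleotide positions).
FRAGMENT = (3700, 4194)  # pol fragment inserted into p37M1234
CRS = (3792, 4052)       # region termed CRS by Cochrane et al.

# Oligonucleotide positions as listed in Table 2.
POL_OLIGOS = {
    "M8.2pol": (3643, 3698),
    "M9pol": (3749, 3800),
    "M9.2pol": (3806, 3863),
    "M10pol": (3950, 4001),
    "M11pol": (4031, 4096),
    "M12pol": (4097, 4151),
    "M13pol": (4220, 4271),
}


def contained_in(inner, outer):
    """True if the interval `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def overlaps(a, b):
    """True if the inclusive intervals `a` and `b` share any position."""
    return a[0] <= b[1] and b[0] <= a[1]


in_fragment = [n for n, s in POL_OLIGOS.items() if contained_in(s, FRAGMENT)]
touching_crs = [n for n, s in POL_OLIGOS.items() if overlaps(s, CRS)]

if __name__ == "__main__":
    print("within inserted fragment:", in_fragment)
    print("overlapping CRS:", touching_crs)
```

With the Table 2 coordinates, M12pol lies inside the inserted fragment but outside the CRS interval, whereas M9pol through M11pol each overlap it.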

Table 1

Correspondence between Sequence
Identification Numbers and Nucleotides in Figure 4

Sequence ID Nos.
SEQ ID NO: 1
SEQ ID NO: 2
SEQ ID NO: 3
SEQ ID NO: 4
SEQ ID NO: 5
SEQ ID NO: 6
SEQ ID NO: 7
SEQ ID NO: 8
SEQ ID NO: 9

Table 2

Synthetic oligonucleotides used
in the mutagenesis of HIV-1 gag and pol regions

The upper sequence is the wild-type HIV-1 as
found in HIV^r while the bottom is the mutant 
oligonucleotide sequence. The location of the sequence is
indicated in parentheses. 



Figure 4
[...] 536-[...], below line (M2)
nucleotides 585-634, below line (M3)
nucleotides 654-703, below line (M4)

M5gag (778-824) 

CACCTAGAACTTTAAATGCATGGGTAAAAGTAGTAGAAGAGAAGGCT (SEQ ID
NO: 22)

XX X X X XXX

CACCTAGAACccTgAAcGCcTGGGTgAAgGTgGTAGAAGAGAAGGCT (SEQ ID
NO: 23)

M6gag (871-915) 

CCACCCCACAAGATTTAAACACCATGCTAAACACAGTGGGGGGAC (SEQ ID NO: 

X XX X XXX X 

CCACCCCACAgGAccTgAACACgATGtTgAACACcGTGGGGGGAC (SEQ ID NO: 



M7gag (1105-1139) 

CAGTAGGAGAAATTTATAAAAGATGGATAATCCTG (SEQ ID NO: 26) 
X X X X X 

CAGTAGGAGAgATcTAcAAGAGgTGGATAATCCTG (SEQ ID NO: 27) 
M8gag (1140-1175) 

GGATTAAATAAAATAGTAAGAATGTATAGCCCTACC (SEQ ID NO: 28) 
X X X X X X 

GGATTgAAcAAgATcGTgAGgATGTATAGCCCTACC (SEQ ID NO: 29) 



M9gag (1228-1268)

ACCGGTTCTATAAAACTCTAAGAGCCGAGCAAGCTTCACAG (SEQ ID NO: 30)
XXX XX X X

ACCGGTTCTAcAAgACcCTgcGgGCtGAGCAAGCTTCACAG (SEQ ID NO: 31)



M10gag (1321-1364)

ATTGTAAGACTATTTTAAAAGCATTGGGACCAGCGGCTACACTA (SEQ ID NO:
X XX X X XX X X

ATTGTAAGACcATcCTgAAgGCtcTcGGcCCAGCGGCTACACTA (SEQ ID NO:
M11gag (1416-1466)

AGAGmTGGCTGAAGCAATGAGCCAAGTAACAAATTCAGCTACCATAATG ( SEQ 

AGAGTOTTG^ ( SEQ 



M12gag (1470-1520) 

S G NO?^6^ TmA ^^ CC (SEQ 

X XX XX X XX 

S G Nof1^ ( SE °- 
M13gag (1527-1574)

GAAGGGCACATAGCCAGAAATTGCAGGGCCCCTAGGAAAAAGGGCTGT (SEQ ID
NO: 38)

XXX XX X

GAAGGGCACACcGCCAGgAAcTGCcGGGCCCCccGGAAgAAGGGCTGT (SEQ ID
NO: 39)



M14gag (1581-1631) 

TGT^AAAGGAAGGACACCAAATGAAAGATTGTACTGAGAGACAGGCTAAT ( SEQ 

XX X xxxx xx 

TGTGGAAAGGAgGGgCACCAgATGAAgGAcTGcACgGAGcGgCAGGCTAAT (SEQ 
ID NO: 41) 

MOpol (1823-1879) (K to R difference introduced) 

CCCCTGGTCACAATAAAGATAGGGGGGCAACTAAAGGAAGCTCTATTAGATACAGGAG
(SEQ ID NO: 42) 

XXX X X XX X 

^SEQ T ID T ^^43^ SG ^^ 

M1pol (1936-1987)

GATAGGGGGAATTGGAGGTTTTATCAAAGTAAGACAGTATGATCAGATACTC (SEQ
ID NO: 44)

XX X XX XXX XX 

GATAGGGGGgATcGGgGGcTTcATCAAgGTgAGgCAGTAcGAcCAGATACTC (SEQ 
ID NO: 45) 

M2pol (2105-2152) 

CCTATTGAGACTGTACCAGTAAAATTAAAGCCAGGAATGGATGGCCCA (SEQ ID

X XXXXX XX 
CCTATTGAGACgGTgCCcGTgAAgTTgAAGCCgGGgATGGATGGCCCA (SEQ ID 

M3.2pol (2162-2216) 

C^TGGCCATTGACAGAAGAAAAAATAAAAGCATTAGTAGAAATTTGTACAGAGA 
(SEQ ID NO: 48) 

X XXXXX X X 

30 ^^^TTGACgGAAGAgAAgATcAAgGCcTTAGTcGAAATcTGTACAGAGA 
M4pol (2465-2515)

2°?!?^ ( SEQ 

XXXX XXXX X 

i£ CAGG ^ G f^ cACg ^ ( SEQ 

M5pol (2873-2921) 

TTAGTGGGGAAATTGAATTGGGCAAGTCAGATTTACCCAGGGATTAAAG (SEQ ID 

XX X XX X X 

TTAGTGGGGAAggTGAAcTGGGCgAGcCAGATcTACCCgGGGATTAAAG (SEQ ID 

M6pol (3098-3150)

lS C NO AT 5 iG t ^^ TATC ^ TTTAT ^ GAGC ^ mAA ^TCTGAAAACAGG (SEQ 

XXXXXX XXXX 
?D CCAAT 55f CgTAcCA 5 ATcTAcC AgGAGCCgTTcAAgAAcCTGAAAACAGG ( SEQ 

M7pol (3242-3290) 

So GG 5 AAAGACTCCT ^ TOT ^ CTGCC ^ TACA ^ {SE <2 ID 

XX X XX XX X 

TGGGGAAAGACgCCgAAgTTcAAgCTGCCCATcCAgAAGGAgACATGGG (SEQ ID 

M8pol (3520-3569) 

GAAGACTCAGTTACAAGC^TTTATCTAGCTTTGCAGGATTCGGGATTAG (SEQ ID 



XXXXXXXXX X 
25 Jjgf S }£f Mc ^^ (SEQ ID 

M8.2pol (3643-3698) 
30 tfilZg 3 *^ 
M9pol (3749-3800)

GTCAGTGCTGGAATCAGGAAAGTACTATTTTTAGATGGAATAGATAAGGCCC (SEQ
ID NO: 62)

XX XX XXXXXX 

GTCAGTGCTGGgATCcGGAAgGTgCTATTccTgGAcGGgATcGATAAGGCCC ( SEQ 
ID NO: 63) 

M9.2pol (3806-3863) 

GAACATGAGAAATATCACAGTAATTGGAGAGCAATGGCTAGTGATTTTAACCTGCCAC 
(SEQ ID NO: 64)

XX XXXX XXX xxxx 

GAACATGAGAAgTAcCACtccAAcTGGcGcGCtATGGCcAGcGAcTTcAACCTGCCAC 
(SEQ ID NO: 65) 

M10pol (3950-4001)

GGAATATGGCAACTAGATTGTACACATTTAGAAGGAAAAGTTATCCTGGTAG ( SEQ 
ID NO: 66) 

XXXXXXXXXXX X 
^^TATGGCAgCTgGAcTGcACgCAccTgGAgGGgAAgGTgATCCTGGTAG (SEQ 

M11pol (4031-4096)

GCAGAAGTTATTCCAGCAGAAACAGGGCAGGAAACAGCATATTTTCTTTTAAAATTA 
-CAGGAAGA (SEQ ID NO: 68) 

XXX X XXXXXXXXXX 

GCAGAAGTTATcCCtGCtGAAACtGGGCAGGAgACcGCcTAcTTcCTcrcTcAAAcTcG 
-CAGGAAGA (SEQ ID NO: 69) 

M12pol (4097-4151) 

TGGCCAGTAAAAACAATACATACTGACAATGGCAGCAATTTCACCGGTGCTACGG 
(SEQ ID NO: 70)

XXXXXX XX X X 

TGGCCAGTgAAgACgATcCAcACgGACAAcGGaAGCAAcTTCACtGGTGCTACGG 
(SEQ ID NO: 71)

M13pol (4220-4271) 

GGAGTAGTAGAATCTATGAATAAAGAATTAAAGAAAATTATAGGACAGGTAA ( SEQ 
ID NO: 72) 

X XXXX XXX 

GGAGTAGTAGAATCcATGAAcAAgGAAcTgAAGAAgATcATcGGACAGGTAA (SEQ 
ID NO: 73) 

M12pol-p (4097-4151) (indicates the sequence found in
p37M1234RCRSP10+P12p)

TGGCCAGTAAAAACAATACAcACgGACAAcGGaAGCAAcTTCACtGGTGCTACGG
(SEQ ID NO: 74)
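
Table 2 marks each substituted nucleotide with an X between the wild-type and mutant sequences. Where that alignment has been lost, the marker line can be regenerated by direct comparison of the two sequences. A minimal sketch follows; the function names are hypothetical and the M8gag pair is used only as sample input.

```python
def mutation_positions(wild, mutant):
    """Return 0-based positions at which two equal-length sequences differ.
    Case is ignored, so lower-case letters that match the wild type
    are not counted as substitutions."""
    if len(wild) != len(mutant):
        raise ValueError("sequences must have equal length")
    pairs = zip(wild.upper(), mutant.upper())
    return [i for i, (w, m) in enumerate(pairs) if w != m]


def marker_line(wild, mutant, mark="X"):
    """Build a line with `mark` under every substituted position."""
    positions = set(mutation_positions(wild, mutant))
    line = "".join(mark if i in positions else " " for i in range(len(wild)))
    return line.rstrip()


# Sample input: M8gag (1140-1175) as listed in Table 2.
M8GAG_WILD = "GGATTAAATAAAATAGTAAGAATGTATAGCCCTACC"
M8GAG_MUTANT = "GGATTgAAcAAgATcGTgAGgATGTATAGCCCTACC"

if __name__ == "__main__":
    print(M8GAG_WILD)
    print(marker_line(M8GAG_WILD, M8GAG_MUTANT))
    print(M8GAG_MUTANT)
```

Printed in a fixed-width font, the three lines reproduce the layout of a Table 2 entry, with each X aligned to the substituted nucleotide.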
Table 3

Sequences of mutant oligos designed
to eliminate the INS effect of fragment F

The six oligonucleotides used to eliminate the
INS effect of fragment F (oligos #1 to #6) are set forth
above in Example 2 (SEQ ID NOS: 10-15).



Table 4

Sequence of mutant oligos designed to
destroy INS elements within the
env coding region

The wildtype (top) and the mutant oligo (below)
of 26 different regions are shown.
Mutant oligos for env of HIV-1:
M1 (5834-5878) 46-mer

CTTGGGATGTTGATGATCTGTAGTGCTACAGAAAAATTGTGGGTC (SEQ ID NO: 

x xxxxxxxx 

CTTGGGATGcTGATGATcTGcAGcGCcACcGAgAAgcTGTGGGTC (SEQ ID NO: 

M2 (5886-5908) 24-mer

ATTATGGGGTACCTGTGTGGAAG (SEQ ID NO: 77)
XXX

ATTATGGcGTgCCcGTGTGGAAG (SEQ ID NO: 78) 

M3 (5920-5956) 38-mer 
CACTCTATTTTGTGCATCAGATGCTAAAGCATATGAT (SEQ ID NO: 79)

CACTCTATTcTGcGCcTCcGAcGCcAAgGCATATGAT (SEQ ID NO: 80) 
M4 (5957-5982) 27-mer 

ACAGAGGTACATAATGTTTGGGCCAC (SEQ ID NO: 81)
X X X X 

ACAGAGGTgCAcAAcGTcTGGGCCAC (SEQ ID NO: 82) 
M5 (6006-6057) 53-mer 

^^ C t3^ G ^ GTAGTATT ^ T ^ TGTGA ^ < SE( 3 

xxxxxx XX xxxx 

C D^§ CC ^ 9GA9GTgGTgCTGGTg ^ CGTGA ^ ( SEQ 
M6 (6135-6179) 46-mer

TAACCCCACTCTGTGTTAGTTTAAAGTGCACTGATTTGAAGAATG (SEQ ID NO:
85)

X X X XX X X XX 

TAACCCCcCTCTGcGTgAGccTgAAGTGCACcGAccTGAAGAATG (SEQ ID NO: 

M7 (6251-6280) 31-mer 

ATCAGCACAAGCATAAGAGGTAAGGTGCAG (SEQ ID NO: 87)

X XX X X 

ATCAGCACcAGCATccGcGGcAAGGTGCAG (SEQ ID NO: 88) 

M8 (6284-6316) 34-mer 

GAATATGCATTTTTTTATAAACTTGATATAATA (SEQ ID NO: 89)

X X X X X X 
GAATATGCcTTcTTcTAcAAgCTgGATATAATA (SEQ ID NO: 90) 

M9 (6317-6343) (28-mer) 

CCAATAGATAATGATACTACCAGCTAT (SEQ ID NO: 91)
X X X X 

CCAATAGcTAAgGAcACcACCAGCTAT (SEQ ID NO: 92) 

M10 (6425-6469) (46-mer)

GCCCCGGCTGGTTTTGCGATTCTAAAATGTAATAATAAGACGTTC (SEQ ID NO:
93)

XXX xxxxxx 
GCCCCGGCcGGcTTcGCGATcCTgAAgTGcAAcAAcAAGACGTTC (SEQ ID NO: 
M11 (6542-6583) (42-mer)

CAACTGCTGTTAAATGGCAGTCTAGCAGAAGAAGAGGTAGTA (SEQ ID NO: 95)
XXX XXXXX 

CAACTGCTGcTgAAcGGCAGcCTgGCcGAgGAgGAGGTAGTA (SEQ ID NO: 96) 
M12 (6590-6624) (35-mer) 

TCTGTCAATTTCACGGACAATGCTAAAACCATAAT (SEQ ID NO: 97)
XXXXX
TCTGCGAAcTTCACcGACAAcGCcAAgACCATAAT (SEQ ID NO: 98)

M13 (6632-6663) (32-mer) 

CTGAACACATCTGTAGAAATTAATTGTACAAG (SEQ ID NO: 99)

X X X X X X
CTGAACCAgTCcGTgGAgATcAAcTGTACAAG (SEQ ID NO: 100) 

M14 (6667-6697) (31-mer) 

CAACAACAATACAAGAAAAAGAATCCGTATC (SEQ ID NO: 101)
X X X XX X 

CAACAACAAcACcGGcAAgcGcATCCGTATC (SEQ ID NO: 102) 
M15 (6806-6852) (47-mer)

GCTAGCAAATTAAGAGAACAATT^ (SEQ ID 

XXXXXXXXXXXXX 
NoT A ?04t gCTgCGCGAgCAgTACGGg ^ C ^ C ^ gACcAT ^ TCTT (SEQ ID 

M16 (nt 6917-6961) (45-mer) 

TTCTACTGTAATTCAACACAACTGTTTAATAGTACTTGGTTTAAT (SEQ ID NO: 

X X X X X XXXX 
TTCTACTGgAAcTCcACcCAgCTGTTcAAcAGcACcTGGTTTAAT (SEQ ID NO: 

M17 (nt 7006-7048) (43-mer) 

CACAATCACCCTCCCATGCAGAATAAAACAAATTATAAACATG (SEQ ID NO: 

XXX XXXXXX 
CACAATCACcCTgCCcTGCcGcATcAAgCAgATcATAAACATG (SEQ ID NO: 

M18 (nt 7084-7129) (46-mer) 

CATCAGTGGACAAATTAGATGTTCATCAAATATTACAGGGCTGCTA (SEQ ID NO: 

XXXXXXXXXXX 
CATCAGCGGcCAgATccGcTGcTCcTCcAAcATcACcGGGCTGCTA (SEQ ID NO: 

M19 (nt 7195-7252) (58-mer) 
20 ^Q^^*^!j^^^^ 

X X X XX xxxxxxxxx 
M20 (nt 7594-7633) (40-mer) 

GCCTTGGAATGCTAGTTGGAGTAATAAATCTCTGGAACAG (SEQ ID NO: 113)
XXX XXXX

GCCTTGGAAcGCcAGcTGGAGcAAcAAgTCcCTGGAACAG (SEQ ID NO: 114) 
M21 (nt 7658-7689) (32-mer) 

GAGTGGGACAGAGAAATTAACAATTACACAAG (SEQ ID NO: 115) 
GAGTGGGACcGcGAgATcAACAAcTACACAAG (SEQ ID NO: 116) 
M22 (nt 7694-7741) (48-mer) 

ATAC^CTCCTTAATT^^ (SEQ ID 

X X X X X X X XXX 
ATACACTCCcTgATcGAgGAgTCcCAgAAGCAgCAgGAgAAGAATGAA (SEQ ID 
M23 (nt 7954-7993) (40-mer)

CAGGCCCGAAGGAATAGAAGAAGAAGGTGGAGAGAGAGAC (SEQ ID NO: 119) 

XXXXXXXX 
CAGGCCCGAgGGcATcGAgGAgGAgGGcGGcGAGAGAGAC (SEQ ID NO: 120) 

M24 (nt 8072-8121) (50-mer) 

TACCACCGCTTGAGAGACTTACTCTTGATTGTAACGAGGATTGTGGAACT (SEQ ID
NO: 121) 

XXXXXX XX X 

TACCACCGCcTGcGcGACcTgCTCcTGATcGTgACGAGGATcGTGGAACT (SEQ ID 
NO: 122) 

M25 (nt 8136-8179) (44-mer) 

GGTGGGAAGCCCTCAAATATTGGTGGAATCTCCTACAGTATTGG (SEQ ID NO:
123)

X XX XX 

GGTGGGAgGCCCTCAAgTAcTGGTGGAAcCTCCTcCAGTATTGG (SEQ ID NO: 



M26 (nt 8180-8219) (40-mer) 

AGTCAGGAACTAAAGAATAGTGCTGTTAGCTTGCTCAATG (SEQ ID NO: 125) 
XX XXXXXX 
AGTCAGGAgCTgAAGAAcAGcGCcGTgAaCcTGCTCAATG (SEQ ID NO: 126)

Comments:

Although the vast majority of oligonucleotides
follow the HXB2 sequence, some exceptions are noted:

In oligo M15, nt 6807 follows the pNL43
sequence. (Specifically, nt 6807 is C in NL43 but A in
HXB2.) Oligo M26 has the nucleotide sequence derived from
pNL43.

EXAMPLE 5

USE OF P37M1-10D OR P55M1-13P0 IN
IMMUNOPROPHYLAXIS OR IMMUNOTHERAPY

In postnatal gene therapy, new genetic
information has been introduced into tissues by indirect
means such as removing target cells from the body,
infecting them with viral vectors carrying the new genetic
information, and then reimplanting them into the body; or
by direct means such as encapsulating formulations of DNA
in liposomes; entrapping DNA in proteoliposomes containing
viral envelope receptor proteins; calcium phosphate
co-precipitating DNA; and coupling DNA to a
polylysine-glycoprotein carrier complex. In addition, in vivo
infectivity of cloned viral DNA sequences after direct
intrahepatic injection with or without formation of
calcium phosphate coprecipitates has also been described.
mRNA sequences containing elements that enhance stability
have also been shown to be efficiently translated in
Xenopus laevis embryos, with the use of cationic lipid
vesicles. See, e.g., J.A. Wolff et al., Science
247:1465-1468 (1990) and references cited therein.

Recently, it has also been shown that injection
of pure RNA or DNA directly into skeletal muscle results
in significant expression of genes within the muscle
cells. J.A. Wolff et al., Science 247:1465-1468 (1990).
Forcing RNA or DNA into muscle cells by other
means, such as by particle acceleration (N.-S. Yang et
al., Proc. Natl. Acad. Sci. USA 87:9568-9572 (1990); S.R.
Williams et al., Proc. Natl. Acad. Sci. USA 88:2726-2730
(1991)) or by viral transduction, should also allow the DNA
or RNA to be stably maintained and expressed. In the
experiments reported in Wolff et al., RNA or DNA vectors
were used to express reporter genes in mouse skeletal
muscle cells, specifically cells of the quadriceps
muscles. Protein expression was readily detected and no
special delivery system was required for these effects.
Polynucleotide expression was also obtained when the
composition and volume of the injection fluid and the
method of injection were modified from the described
protocol. For example, reporter enzyme activity was
reported to have been observed with 10 to 100 µl of
hypotonic, isotonic, and hypertonic sucrose solutions,
Opti-MEM, or sucrose solutions containing 2 mM CaCl2 and
also to have been observed when the 10- to 100-µl
injections were performed over 20 min with a pump instead
of within 1 min.

Enzymatic activity from the protein encoded by
the reporter gene was also detected in abdominal muscle
injected with the RNA or DNA vectors, indicating that
other muscles can take up and express polynucleotides.
Low amounts of reporter enzyme were also detected in other
tissues (liver, spleen, skin, lung, brain and blood)
injected with the RNA and DNA vectors. Intramuscularly
injected plasmid DNA has also been demonstrated to be
stably expressed in non-human primate muscle. S. Jiao et
al., Hum. Gene Therapy 3:21-33 (1992).

It has been proposed that the direct transfer of
genes into human muscle in situ may have several potential
clinical applications. Muscle is potentially a suitable
tissue for the heterologous expression of a transgene that
would modify disease states in which muscle is not
primarily involved, in addition to those in which it is.
For example, muscle tissue could be used for the
heterologous expression of proteins that can immunize, be
secreted in the blood, or clear a circulating toxic
metabolite. The use of RNA and a tissue that can be
repetitively accessed might be useful for a reversible
type of gene transfer, administered much like conventional
pharmaceutical treatments. See J.A. Wolff et al.,
Science 247:1465-1468 (1990) and S. Jiao et al., Hum. Gene
Therapy 3:21-33 (1992).

It had been proposed by J.A. Wolff et al.,
supra, that the intracellular expression of genes encoding
antigens might provide alternative approaches to vaccine
development. This hypothesis has been supported by a
recent report that plasmid DNA encoding influenza A
nucleoprotein injected into the quadriceps of BALB/c mice
resulted in the generation of influenza A
nucleoprotein-specific cytotoxic T lymphocytes (CTLs) and protection
from a subsequent challenge with a heterologous strain of
influenza A virus, as measured by decreased viral lung
titers, inhibition of mass loss, and increased survival.
J.B. Ulmer et al., Science 259:1745-1749 (1993).
Therefore, it appears that the direct injection
of RNA or DNA vectors encoding the viral antigen can be
used for endogenous expression of the antigen to generate
the viral antigen for presentation to the immune system
without the need for self-replicating agents or adjuvants,
resulting in the generation of antigen-specific CTLs and
protection from a subsequent challenge with a homologous
or heterologous strain of virus.

CTLs in both mice and humans are capable of
recognizing epitopes derived from conserved internal viral
proteins and are thought to be important in the immune
response against viruses. By recognition of epitopes from
conserved viral proteins, CTLs may provide cross-strain
protection. CTLs specific for conserved viral antigens
can respond to different strains of virus, in contrast to
antibodies, which are generally strain-specific.

Thus, direct injection of RNA or DNA encoding
the viral antigen has the advantage of being without some
of the limitations of direct peptide delivery or viral
vectors. (See J.B. Ulmer et al., supra, and the
discussions and references therein.) Furthermore, the
generation of high-titer antibodies to expressed proteins
after injection of DNA indicates that this may be a facile
and effective means of making antibody-based vaccines
targeted towards conserved or non-conserved antigens,
either separately or in combination with CTL vaccines
targeted towards conserved antigens. These may also be
used with traditional peptide vaccines, for the generation
of combination vaccines. Furthermore, because protein
expression is maintained after DNA injection, the
persistence of B and T cell memory may be enhanced,
thereby engendering long-lived humoral and cell-mediated
immunity.
1. Vectors for the immunoprophylaxis or
immunotherapy against HIV-1

The mutated gag genomic sequences in vectors
p37M1-10D or p55M1-13P0 (Fig. 6) will be inserted in
expression vectors using a strong constitutive promoter
such as CMV or RSV, or an inducible promoter such as
HIV-1.

The vector will be introduced into animals or
humans in a pharmaceutically acceptable carrier using one
of several techniques such as injection of DNA directly
into human tissues; electroporation or transfection of the
DNA into primary human cells in culture (ex vivo),
selection of cells for desired properties and
reintroduction of such cells into the body (said
selection can be for the successful homologous
recombination of the incoming DNA to an appropriate
preselected genomic region); generation of infectious
particles containing the gag gene, infection of cells ex
vivo and reintroduction of such cells into the body; or
direct infection by said particles in vivo.

Substantial levels of protein will be produced 
leading to an efficient stimulation of the immune system. 

In another embodiment of the invention, the
described constructs will be modified to express mutated
gag proteins that are unable to participate in virus
particle formation. It is expected that such gag proteins
will stimulate the immune system to the same extent as the
wild-type gag protein, but be unable to contribute to
increased HIV-1 production. This modification should
result in safer vectors for immunotherapy and
immunoprophylaxis.
EXAMPLE 6

INHIBITION OF HIV-1 EXPRESSION USING TRANSDOMINANT
(TD)-TD-GAG-TD REV OR TD GAG-PRO-TD REV GENES

Direct injection of DNA or use of vectors other
than retroviral vectors will allow the constitutive
high-level expression of trans-dominant gag (TDgag) in cells. In
addition, the approach taken by B.K. Felber et al.,
Science 239:184-187 (1988) will allow the generation of
retroviral vectors, e.g. mouse-derived retroviral vectors,
encoding HIV-1 TDgag, which will not interfere with the
infection of human cells by the retroviral vectors. In
the approach of Felber et al., supra, it was shown that
fragments of the HIV-1 LTR containing the promoter and
part of the polyA signal can be incorporated without
detrimental effects within mouse retroviral vectors and
remain transcriptionally silent. The presence of Tat
protein stimulated transcription from the HIV-1 LTR and
resulted in the high-level expression of genes linked to
the HIV-1 LTR.

The generation of hybrid TDgag-TDRev or
TDgag-pro-TDRev genes and the introduction of expression vectors
in human cells will allow the efficient production of two
proteins that will inhibit HIV-1 expression. The
incorporation of two TD proteins in the same vector is
expected to amplify the effects of each one on viral
replication. The use of the HIV-1 promoter in a manner
similar to the one described in B.K. Felber et al., supra,
will allow high-level gag and rev expression in infected
cells. In the absence of infection, expression will be
substantially lower. Alternatively, the use of other
strong promoters will allow the constitutive expression of
such proteins. This approach could be highly beneficial,
because of the production of a highly immunogenic gag,
which is not able to participate in the production of
infectious virus, but which, in fact, antagonizes such
production. This can be used as an efficient
immunoprophylactic or immunotherapeutic approach against
AIDS.

Examples of trans-dominant mutants are described
in Trono et al., Cell 59:112-120 (1989).

1. Generation of constructs encoding
transdominant gag mutant proteins

Gag mutant proteins that can act as trans - 
dominant mutants, as described, for example, in Trono et 
10 a1 -* supra, will be generated by modifying vector 

P37M1-10D or p55Ml-13P0 to produce transdominant gag 
proteins at high constitutive levels. 

The transdominant gag protein will stimulate the
immune system and will inhibit the production of
infectious virus, but will not contribute to the
production of infectious virus.

The added safety of this approach makes it more 
acceptable for human application. 






Those skilled in the art will recognize that any 
gene encoding a mRNA containing an inhibitory/instability 
sequence or sequences can be modified in accordance with 
the exemplified methods of this invention or their 
functional equivalents. 
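
As a rough illustration of the kind of modification referred to above, the following Python sketch raises the G/C content of a window of a coding sequence using only synonymous codon changes, so that the encoded amino acid sequence is unchanged. The function names, the 50% target parameter, and the toy input sequence are assumptions made for this example and are not taken from the specification. The sketch does not locate inhibitory/instability sequences, which must be identified experimentally, and it assumes an in-frame, uppercase A/C/G/T input.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def gc_count(seq):
    """Number of G or C bases in seq."""
    return sum(base in "GC" for base in seq)


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    return gc_count(seq) / len(seq)


def translate(dna):
    """Translate an in-frame DNA string, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def raise_gc_synonymously(dna, start, end, target=0.5):
    """Hypothetical helper: within the codons overlapping [start, end)
    (0-based), replace codons one at a time with the synonymous codon that
    has the most G/C, stopping once the window reaches the target G/C
    fraction. The reading frame is assumed to begin at position 0."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    first, last = start // 3, min((end + 2) // 3, len(codons))
    for k in range(first, last):
        if gc_fraction("".join(codons[first:last])) >= target:
            break
        codon = codons[k]
        if len(codon) < 3:
            continue  # leave a trailing partial codon untouched
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == CODON_TABLE[codon]]
        # Prefer more G/C; on a tie, prefer keeping the original codon.
        codons[k] = max(synonyms, key=lambda c: (gc_count(c), c == codon))
    return "".join(codons)


# Toy AT-rich sequence (invented for illustration, not from the patent).
original = "AAAGAAATTAAATTAGAAAAA"
mutated = raise_gc_synonymously(original, 0, len(original))
print(translate(original), translate(mutated))
print(round(gc_fraction(original), 2), round(gc_fraction(mutated), 2))
```

A design note: the loop stops as soon as the window reaches the target, so an AT-rich window is moved toward a balanced composition rather than to the maximum possible G/C content. For a window that cannot reach the target with synonymous changes alone, every codon in the window is simply replaced by its most G/C-rich synonym.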

Modifications of the above described modes for 
carrying out the invention that are obvious to those of 
skill in the fields of genetic engineering, protein 
chemistry, medicine, and related fields are intended to be 
within the scope of the following claims. 

Every reference cited hereinbefore is hereby
incorporated by reference in its entirety.

WHAT IS CLAIMED IS:

1. A method for reducing the effect of
inhibitory/instability sequences within the coding region
of a mRNA, said method comprising the steps of:

(a) providing a gene which encodes said
mRNA;

(b) identifying the inhibitory/instability
sequences within said gene which
encode said inhibitory/instability
sequences within the coding region of
said mRNA;

(c) mutating said inhibitory/instability
sequences within said gene by making
multiple point mutations;

(d) transfecting said mutated gene into a
cell;

(e) culturing said cell in a manner to
cause expression of said mutated gene;

(f) detecting the level of expression of
said gene to determine whether the
effect of said inhibitory/instability
sequences within the coding region of
the mRNA has been reduced.

2. The method of Claim 1 further comprising
the step of fusing said mutated gene to a reporter gene
prior to said transfecting step and said detecting step is
performed by detecting the level of expression of said
reporter gene.

3. The method of Claim 1 wherein step (b)
further comprises the steps of

(a) fusing said gene or fragments of said
gene to a reporter gene to create a
fused gene;

(b) transfecting said fused gene into a
cell;

(c) culturing said cell in a manner to
cause expression of said fused gene;

(d) detecting the level of expression of
said fused gene to determine whether
the expression of said fused gene is
reduced relative to the expression of
said reporter gene.

4. The method of Claim 3 wherein step (a)
comprises fusing said gene or fragments of said gene 3' to
the stop codon of said reporter gene.

5. The method of Claim 3 wherein step (a) 
comprises fusing said gene or fragments of said gene in 
frame with the 3' end of the coding region of said 
reporter gene. 

6. The method of Claim 1 or 2 wherein said
mutating step changes the codons such that the amino acid
sequence encoded by the mRNA is unchanged.

7. The method of Claim 6 wherein said
inhibitory/instability sequences are AT-rich and wherein
said mutating step comprises substituting either G or C
for either A or T and wherein the final nucleotide
composition of said mutated inhibitory sequence is about
50% A and T and about 50% G and C.

8. The method of Claim 6 wherein at least 75%
of the point mutations replace conserved nucleotides with
non-conserved nucleotides.

9. The method of Claim * wherein said mutating
step comprises substituting less preferred codons with
more preferred codons.

10. The method of Claim 1 or 2 wherein said
mRNA encodes the GAG protein of a Rev-dependent complex
retrovirus.

11. The method of Claim 10 wherein the Rev-dependent
complex retrovirus is human immunodeficiency
virus-1.



12. A method of increasing the production of a
polypeptide, wherein said polypeptide is encoded by a mRNA
that contains one or more inhibitory/instability
sequences, said method comprising the steps of:

(a) providing a gene which encodes said
mRNA;

(b) identifying the inhibitory/instability
sequences within said gene which
encode said inhibitory/instability
sequences within the coding region of
said mRNA;

(c) mutating said inhibitory/instability
sequences within said gene by making
multiple point mutations;

(d) transfecting said mutated gene into a
cell;

(e) culturing said cell in a manner to
cause expression of said mutated gene;

(f) detecting the level of expression of
said gene to determine that the effect
of said inhibitory/instability
sequences within the coding region of
the mRNA has been reduced;

(g) providing a host cell transfected with
an expression vector containing said
mutated gene;

(h) culturing said host cell to cause
expression of said polypeptide; and

(i) recovering said polypeptide.

13. A method of producing polypeptides, whose
native production is impeded by the presence of an
inhibitory/instability sequence, comprising the steps of:

(a) providing a host cell transfected with
an expression vector containing a gene
encoding said polypeptide, said gene
having been mutated to decrease the
effect of the inhibitory/instability
sequence;

(b) culturing said host cell to cause
expression of said polypeptide; and

(c) recovering said polypeptide.

14. The method of Claim 13 wherein said host 
cell is prokaryotic. 

15. The method of Claim 13 wherein said host 
cell is eukaryotic. 

16. The method of Claims 13, 14 or 15 wherein 
said gene is a cDNA. 



17. The method of Claims 13, 14 or 15 wherein
said gene is genomic.

18. An artificial nucleic acid construct 
comprising a gene wherein the expression of the native 
gene is impeded by the presence of inhibitory/instability 
sequences in the mRNA encoded by said native gene, said 
gene having been mutated to decrease the effect of the
inhibitory/instability sequence.

19. The construct of Claim 18 wherein the amino
acid sequence encoded by said mutated gene is the same as
the amino acid sequence encoded by the native gene.

20. The construct of Claim 19 wherein said
native gene is HIV-1 gag.

21. The construct of Claim 20 wherein said HIV-1
gag gene has been mutated by the introduction of
multiple point mutations between nucleotides 402 and 452,
536 and 583, 585 and 634, and 654 and 703.

22. The construct of claim 19 wherein said
native gene is HIV-1 env.






23. An assay kit for identifying 
inhibitory/instability sequences in a mRNA, comprising:

(a) the nucleic acid construct of Claim 20 
or 21; and 

(b) a detection system for detecting the 
level of expression of said gene in 
said nucleic acid construct. 

24. The kit of Claim 23 wherein said detection
system is an ELISA.

25. An artificial nucleic acid construct
comprising a gene mutated by the method of Claim 1 or 2.

26. A vector comprising the nucleic acid
construct of Claim 25.

27. A transformed host cell comprising the 
artificial nucleic acid construct of Claim 25. 

28. A vector comprising the nucleic acid
construct of Claim 18 or 19.

29. A transformed host cell comprising the
artificial nucleic acid construct of Claim 18 or 19.

30. A transformed host cell of Claim 29 wherein 
said cell is selected from the group consisting of 
eukaryotes and prokaryotes. 

31. The host cell of Claim 30 wherein said cell 
is a human cell. 

32. The host cell of Claim 30 wherein said cell 
is a Chinese Hamster Ovary cell. 

33. The host cell of Claim 30 wherein said cell
is E. coli.



34. The construct of Claim 20 wherein said HIV-1
gag gene has been mutated by the introduction of
multiple point mutations between nucleotides 402 and 452,
536 and 583, 585 and 634, 654 and 703, 871 and 915, 1105
and 1139, 1140 and 1175 and 1321 and 1364.

35. The construct of Claim 34 wherein said HIV-1
gag gene is p37M1-10D.

36. The construct of Claim 20 wherein said HIV-1
gag gene has been mutated by the introduction of
multiple point mutations between nucleotides 402 and 452,
536 and 583, 585 and 634, 654 and 703, 871 and 915, 1105
and 1139, 1140 and 1175, 1321 and 1364, 1416 and 1466,
1470 and 1520, 1527 and 1574, and 1823 and 1879.

37. The construct of Claim 36 wherein said HIV-1
gag gene is p55M1-13P0.

38. A vaccine composition for inducing immunity
in a mammal against HIV infection comprising a
pharmaceutically acceptable medium and further comprising
a therapeutically effective amount of a nucleic acid
construct capable of producing HIV gag protein in the
absence of any HIV regulatory protein in a cell in vivo.

39. A vaccine composition according to claim 38 
wherein said mammal is a human. 



40. A vaccine composition according to claim 38
wherein said regulatory protein is HIV-1 Rev.

41. A vaccine composition according to claim 38
wherein said construct is selected from the group
consisting of the construct of claim 20, 21, 34, 35, 36,
and 37.



42. A method for inducing immunity against HIV
infection in a mammal which comprises administering to a
mammal a therapeutically effective amount of a vaccine
composition comprising a nucleic acid construct capable of
producing HIV gag protein in the absence of any HIV
regulatory protein in a cell in vivo.

43. A method according to claim 42 wherein said
mammal is a human.

44. A method according to claim 42 wherein said
regulatory protein is HIV-1 Rev.

45. A method according to claim 42 wherein said
construct is selected from the group consisting of the
construct of claim 20, 21, 34, 35, 36, and 37.
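
For readers who want to work with the mutated regions recited in Claims 21, 34 and 36, the following Python sketch tabulates the nucleotide ranges and checks whether a given position falls inside one of them. The variable and function names are hypothetical, and treating both endpoints of each range as inclusive is an assumption of this example, since the claims say only "between nucleotides".

```python
# Nucleotide ranges recited in the claims (endpoints assumed inclusive).
CLAIM_21 = [(402, 452), (536, 583), (585, 634), (654, 703)]
CLAIM_34 = CLAIM_21 + [(871, 915), (1105, 1139), (1140, 1175), (1321, 1364)]
CLAIM_36 = CLAIM_34 + [(1416, 1466), (1470, 1520), (1527, 1574), (1823, 1879)]


def in_mutated_region(position, regions):
    """True if position lies inside any (low, high) range in regions."""
    return any(low <= position <= high for low, high in regions)


def total_span(regions):
    """Total number of nucleotide positions covered by the ranges."""
    return sum(high - low + 1 for low, high in regions)


for name, regions in [("Claim 21", CLAIM_21),
                      ("Claim 34", CLAIM_34),
                      ("Claim 36", CLAIM_36)]:
    print(name, len(regions), "regions,", total_span(regions), "positions")
```

The ranges are cumulative: each later list extends the earlier one, which mirrors how Claims 34 and 36 repeat the ranges of Claim 21 before adding new ones.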

Identification of INS regions within the
env mRNA using the p19 vector.
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1 TGGAAGGGCT 

71 CTTCCCTGAT 

141 AAGTTAGTAC 

211 CTATGAGCCA 

281 ATTTCGTCAC 

351 GGACTTTCCG 

421 GCTACATATA 



AATTTGGTCC 
TGGCAGAACT 
CAGTTGAACC 
GCATGGGATG 
ATGGCCCGAG 
CTGGGGACTT 
AGCAGCTGCT 



CAAAAAAGAC 
ACACACCAGG 
AGAGCAAGTA 
GAGGACCCGG 
AGCTGCATCC 
TCCAGGGAGG 
TTTTGCCTGT 



AAGAGATCCT 
GCCAGGGATC 
GAAGAGGCCA 
AGGGAGAAGT 
GGAGTACTAC 
TGTGGCCTGG 
ACTGGGTCTC 



TGATCTGTGG 
AGATATCCAC 
AATAAGGAGA 
ATTAGTGTGG 
AAAGACTGCT 
GCGGGACTGG 
TCTGGTTAGA 



ATCTACCACA 
TGACCTTTGG 
GAAGAACAGC 
AAGTTTGACA 
GACATCGAGC 
GGAGTGGCGA 
CCAGATCTGA 



491 TCTCTGGCTA 
561 TGCCCGTCTG 
631 AGCAGTGGCG 



CACAAGGCTA 
ATGGTGCTTC 
TTGTTACACC 
GCCTCCTAGC 
TTTCTACAAG 
GCCCTCAGAT 
GCCTGGGAGC 



ACXAGGGAAC 
TTGTGTGACT 
CCCGAACAGG 



CCACTGCTTA 
CTGGTAACTA 
GACTTGAAAG 



AGCCTCAATA 
GAGATCCCTC 
CGAAAGTAAA 



AAGCTTGCCT 
AGACCCTTTT 
GCCAGAGGAG 



TGAGTGCTCA 
AGTCAGTGTG 
ATCTCTCGAC 



AAGTAGTGTG 
GAAAATCTCT 
GCAGGACTCG 



BssHII (711)

701 GCTTGCTGAAGCGCGCGTCGACAGAGAGMGGGTGCt^

1> MetGlyAlaArgAlaSerValLeuSerGlyGlyGluLeuAspArgTrp

y ss erAsnGl nVal Ser Gl nAsnyProl leVal Gl nAs n 1 1 eGl nGI yGI nMat Va 

i m a I j eser P roArgThr LeuAsnAI aT rpVal LysVal Va! Gl uGl uLyi 



1157 
11> 



1233 
31> 



lGGCTTTCAGCCCAGAAGTG 
uLysAlaPheSerProGluVal



ULaJ yAl aThr ProGI nAspL«uAsnThrfwtet Lcu AsnThr Val Gl yGI yH 
\ 6 ^. ™^f^ C CGS"CTATAAAACTCTAAGAGCTGAGCAAGCTTCACAGGAOTTAAAAA^
163> pTyrValAspArgPheTyrLysThrLeuArgAlaGluGlnAlaSerGlnGluValLysAsnTrpMetThrGluThr

189> LeuLeuValGlnAsnAlaAsnProAspCysLysThrIleLeuLysAlaLeuGlyProAlaAlaThrLeuGluGluM
214> etMetThrAlaCysGlnGlyValGlyGlyProGlyHisLysAlaArgValLeu



ApaI (1848)

1841 CGAGGGGGGG CCCGGTACCT TTAAGACCAA TGACTTACAA GGCAGCTGTA GATCTTAGCC ACTTTTTAAA
1911 AGAAAAGGGG GGACTGGAAG GGCTAATTCA CTCCCAAAGA AGACAAGATA TCCTTGATCT GTGGATCTAC
1981 CACACACAAG GCTACTTCCC TGATTGGCAG AACTACACAC CAGGGCCAGG GGTCAGATAT CCACTGACCT
2051 TTGGATGGTG CTACAAGCTA GTACCAGTTG AGCCAGATAA GGTAGAAGAG GCCAATAAAG GAGAGAACAC
2121 CAGCTTGTTA CACCCTGTGA GCCTGCATGG AATGGATGAC CCTGAGAGAG AAGTGTTAGA
2191 GACAGCCGCC TAGCATTTCA TCACGTGGCC CGAGAGCTGC ATCCGGAGTA CTTCAAGAAC TGCTGACATC
2261 GAGCTTGCTA CAAGGGACTT TCCGCTGGGG ACTTTCCAGG GAGGCGTGGC CTGGGCGGGA CTGGGGAGTG
2331 GCGAGCCCTC AGATGCTGCA TATAAGCAGC TGCTTTTTGC CTGTACTGGG TCTCTCTGGT TAGACCAGAT
2401 CTGAGCCTGG GAGCTCTCTG GCTAACTAGG GAACCCACTG CTTAAGCCTC AATAAAGCTT
2471 CTTCAAGTAG TGTGTGCCCG TCTGTTGTGT GACTCTGGTA ACTAGAGATC CCTCAGACCC TTTTAGTCAG
2541 TGTGGAAAAT CTCTAGCACC CCCCAGGAGG TAGAGGTTGC AGTGAGCCAA GATCGCGCCA CTGCATTCCA



llll gS22S£ S^E™^ ^^ARTAAT AAGTTAAGGG TATTAAATAT ATTTATACAT 

2?S cSSSr^ %£lrlr£ ^ T ! TGGCCT GCTCACACCT GCGCCCGGCC CTTTGGGAGG 

tilt ^ TCACCT GAGTTTGGGA GTTCCAGACC AGCCTGACCA ACATGGAGAA ACCCCTTCTC 

llll SSSaI GTATT1TATT CACAGGTATT TCTGGAAAAC TGAAACTGTT 

llll SSgS£ CAGCACAGAG GAAGACTTCT GTGATCAAAT GTGGTGGGAG 

loll cSSSS SSS£ tr™^ C AGTTCTGCCG CAGACTCGGC GGGTGTCCTT CGGTTCAGTT 
3101 aS^Sc ^™ CCACAGGGTG AGGGCTCAGT CCCCAAGACA TAAACACCCA 

3171 GGaSS^a ™^ TCCACCCCGC CTGCTGCCCA GGCAGAGCCG ATTCACCAAG ACGGGAATTA 
52 r^^Ji ACACAGAGCC GGCTGTGCGG GAGAACGGAG TTCTATTATG ACTCAAATCA 

SIS SSSgS^ A^SS^ SSggSI I™** 3 ™* CTTAGTGTGT AGGGGGCCAG TGAGTTGGAG 
?2?5i5SE ^f GAGTCGA ASGTGTCCTT TTGCGCCGAG TCAGTTCCTG GGTGGGGGCC ACAAGATCGG 

SS Jc5gSgS ^ tgcca «m«ccw ggagtgcagg gtctgcaaaa tatctcaagc 

litl aSctSa^ acaatagtga tgttacccca ggaacaactt ggggaaggtc agaatcttgt 

3521 agcctgtagc tgcatgactc ctaaaccata atttcttttt tgtttttttt tttttatttt tgagacaggg 

36a cSScgSg SSSf^ ^? rGCAG TGGTGCRATC ACAGC1CACT 'gSSScCTA GAGCGGCCGC 

3731 SSSSH ™Ir^ ^GCCCTATA GTGAGTCGTA TTACAATTCA CTGGCCGTCG TTTTACAACG 

llll SS^S S^Sr ACTTAATCGC CTTGCAGCAC ATCCCCCTTT CGCCAGCTGG 

™ ^t?F CCG CACCGATCGC CCTTCCCAAC AGTTGCGCAG CCTGAATGGC GAATGGCGCG 

llll i££S£i TTGTTAAAAT TCGCGTTAAA TTTTTGTTAA ATCAGCTCAT TTTTTAACCA 

4oS gSSS^ ^™ TAA ATCAAARGAA TAGACCGAGA TAGGGTTGAG TGTTGTTCCA 

till SSSS^ ATTAAAGAAC GTGGACTCCA ACGTCAAAGG GCGAAAAACC GTCTATCAGG 

J?S SSSSS ^catcaccct *atcaagttt TTTGGGCTCG aggtgccgta aagcactaaa 

J2S CCCGATTTAG AGCTTGACGG GGAAAGCCGG CGAACGTGGC GAGAAAGGAA 

till ^SSS ™^ GGGCGCTAGG GCGCTGGCAA GTGTAGCGGT CACGCTGCGC GTAACCACCA 

till ^rr™ GCTTAATGCG CCGCTACAGG GCGCGTCCCA GGTGGCACTT TTCGGGGAAA TGTGCGCGGA 

4361 ACCCCTATTT GTTTATTTTT CTAAATACAT TCAAATATGT ATCCGCTCAT GAGACAATAA CCCTGATAAA 
i£ ?gS£££ £££££ JKSS?* ACATTTCCGT GTCGCCCTTA TTCCCT^T

4571 TTccCTPrap TTTTTGCTCA CCCAGAAACG CTGGTGAAAG TAAAAGATGC TGAAGATCAG 

till 3£S -ESSES ^ T f. GftftCTG GCGGTAAGAT CCTTGAGAGT TTTCGCCCCG 

4711 CGGGcS^ rz^™S AGCACTTTTA AAGTTCTGCT ATGTGGCGCG GTATTATCCC GTATTGACGC 
Till GAAaScA^ SISSS CTATTCTCAG AATGACTTGG TTGAGTACTC ACCAGTCACA 

48S1 CTGCGG^Ia ^SACASTA AGAGAATTAT GCAGTGCTGC CATAACCATG AGTGAIAACA 

«S S2552£ CTTACTTCTG ACAACGATCG GAGGACCGAA GGAGCTAACC GCTTTTTTGC ACAACATGGG 
«£ Sca££££ A I= GTTGGGA ACCGGAGCTG AATGAAGCCA OACCAAACGA CGAGCCTGAC 

r^S^T: CTGTAGCAAT GGCAACAACG TTGCGCAAAC 1ATTAACTGG CGAACTACTT ACTCTAGCTT 
I £S£2£S* ATTARTAGAC TGGATGGAGG CGGATAAAGT TGCAGGACCA CTTCTG^GCT CGgScSS 
5 5 201 SSEg SSSSr AGCC «^ CGTGGGTCTC SHJSSJ SSgSSg 

irrrf??^^ gtaagccctc ccgtatcgta gttatctaca cgacggggag TCAGGCAACT ATGGATGAAC 

HA ^™ GATCGCTGAG ATAGGTGCCT CACTGATTAA GCATTGGTAA CTGTCAGACC *S2£ 

5S TAGATTGATT XAAAACTTCA TTTTTAATTT AAAAGGATCT AGGTGAAGAT CCtoSSJ 

till i£22SS S^ 10 ^ TTTTCGTTCC ACTGAGCGTC AGACCCCgS GAAAAgISS 

«m A ^ A I?E!5 TTGAGATCCT TTTTTTCTGC GCGTAAXCTG CTGCTTGCAA ACAAAAAAAC CAcScTACC 

5551 AGCGGTGGTT TGTTTGCCGG ATCAAGAGCT ACCAACTCTT ITTCCGAAGG IAACTGGCTT S^SS^r 

a S ^ ^ Fss sssss ssss sssss 

° ^- GCTCGTCAG6 GGGGCGGAGC CTATGGAAAA ACGCCAGCAA CGCGGCCTTT TTACGGTrrr 

££ SS5S5 ™^ GT ^cctgc gttatcccct gSSSSS ££££££ 

6251 GAACCnSar S^S^S ATACCGCTCG CCGCAGCCGA ACGACCGAGC GCAGCGAGTC ACTGAGCGAG 
till SScS Ef^CCCG CGCGTTGGCC GATTCATTAA TGCAGCTGGC 

«qT ^ZVr^IirZ T CCCG aCTGG AAAGCGGGCA GTGAGCGCAA CGCAATTAAT GTGAGTTAGC TCACTCMTA 

SS ga^Sgga^ SSS SEgE? ttgtgtogaa S££££ SaaSa"™ 

SiTSs) ccatgattac gccaagctcg gaattaaccc tcactaaagg gaacaaaagc 

g JSSg! SSStffEg CCAAGCCCCA CAGTGTGCCC TGAGGCTGCC CCTTCCTTCT AGCGGCTGCC 
SE 2£5££ S52 25225 TCAGCCAAGG TCTGAAACTA GGTgSS 

6741 CAGCcSa^ JSSSSt E?SS£S S 61 ™ 68 SSS^TCA CAGTGCACCC TGACAGTCGT 
6811 ACcSScAA JSSSSS ^ CA Z!?S AC EE^Ef* 0 GTCAGCCTCA CAGGGGGTTT ATCACAGTGC 
6881 TGATCAGAGG JSgSSI TTTTTTTAGT CTCTACTGTG CCTAACTTGT AAGTTAAATT 

6951 cSSg^S S£SS££ ^r^*** CAGTRTATAC AGGGTTCAGT ACTATCGCAT TTCAGGCCTC 
7021 A^SaSS AcSSSr GGTGATGACT ACCTCAGTTG GATCTCCACA GGTCACAGTG 

7091 aSScSS TCG^SS 0 2S2£2S T******* GGCCGCCCTC CACGTGCACA TGGCCGGAGG 
7161 GCTTcSSc JSSSSS. KGCRTCAGA GTCCTTGGTG TGGAGGGAGG GACCAGCGCA 

GCTTCCAGCC ATCCACCTGA TGAACAGAAC CTAGGGAAAG CCCCAGTTCT ACTTACACCA GGAAAGGC 



